ABSTRACT

Suppose $\{Y_n\}$ is a sequence of i.i.d. random variables taking values in $Y$, a complete, separable, non-finite metric space. The probability law, indexed by $\theta \in \Theta$, is unknown to a Bayesian statistician with prior $\mu$ who observes this process. Generalizing Freedman [1965, Annals of Mathematical Statistics], we show that "generically" (i.e., for a residual family of $(\theta, \mu)$ pairs) the posterior beliefs do not weakly converge to a point mass at the "true" $\theta$. Furthermore, for every open set $G \subset \Theta$, generically, the Bayesian will attach probability arbitrarily close to one to $G$ infinitely often.

The above result is applied to a two-armed bandit problem with geometric discounting where arm $k$ yields an outcome in a complete, separable metric space $Y_k$. If the infimum of the possible rewards from playing arm $k$ is less than the infimum from playing arm $k'$, then arm $k$ is (generically) chosen only finitely often. If the infima of the rewards are equal, then both arms are played infinitely often.
JEL  Classification  Numbers:  022,  026,  211. 


1.  INTRODUCTION

There  has  been  a  flurry  of  interest  in  studying  the  asymptotic  dynamics  of  Bayesian  learning  and  control  in 
economic  environments.  In  one  set  of  papers  including  Easley  and  Kiefer  (1988),  Easley  and  Kiefer  (1989),  Kiefer 
and  Nyarko  (1988),  Kiefer  and  Nyarko  (1989),  McLennan  (1987),  Feldman  and  McLennan  (1989),  and  Bikhchandani 
and  Sharma  (1990),  the  authors  analyze  single  agent  decision  problems  in  which  there  is  a  tradeoff  between  current 
period  expected  reward  and  the  expected  value  of  the  information  generated  by  the  current  period  action.  In  another 
strand  of  the  literature,  Blume  and  Easley  (1984),  Bray  and  Kreps  (1987),  Feldman  (1987a),  and  Feldman  (1987b) 
focus  on  properties  of  the  tail  of  the  sequence  of  beliefs  and  outcomes  for  economies  with  many  passively  learning 
agents.  Specifically,  these  latter  papers  consider  whether  Bayesian  learning  by  agents  with  a  correct  specification  of 
the  underlying  structure  but  uncertainty  regarding  the  parameter  values  is  a  sufficient  condition  to  assure  convergence 
to  a  stationary  rational  expectations  equilibrium.

These articles, as well as important earlier contributions of Cyert and DeGroot (1973), Rothschild (1974), and Townsend (1978), have the following common framework. From the vantage point of the economic actors, the set of possible complete descriptions of the relevant time-invariant economic data can be represented as a separable metric space $\Theta$ with Borel $\sigma$-field $\mathcal{B}(\Theta)$. The actors in the model, uncertain as to the "true" $\theta_0 \in \Theta$, have prior beliefs $\mu$ on $(\Theta, \mathcal{B}(\Theta))$ and an induced probability $P_\mu$ on an infinite-horizon outcome space. Denote by $\{\mu_t\}$ the sequence of posterior beliefs. A result common to this literature is a "theorem" that with probability one $\mu_t \Rightarrow \mu_\infty$, where $\mu_\infty$ is the posterior probability conditioned on the limit sub-$\sigma$-field.

In most of the recent papers (exceptions are Feldman and McLennan (1989) and Bikhchandani and Sharma (1990)), this a.s. convergence is established by using the fact that the sequence of posterior beliefs is a martingale with respect to the probability $P_\mu$. It follows from the Martingale Convergence Theorem that with $P_\mu$ probability one, the Bayesian beliefs converge to some (possibly random) limit belief. In contrast, consider the distribution of outcomes and beliefs with respect to the probability measure $P_{\theta_0}$, the probability induced by the "true" parameter $\theta_0$. Intuitively, $P_{\theta_0}$ is the belief of a passive observer who attaches probability one to $\theta_0$ being the truth. One might also inquire as to whether convergence of Bayesian beliefs is obtained with respect to the measure $P_{\theta_0}$. A major point of this paper is to stress that without additional conditions, the answer is negative.

To elaborate on this distinction, suppose that in period $t = 0, 1, \ldots$, agents observe outcomes in a separable metric space $Y$. Given the behavioral rules of the agents, each parameter value $\theta \in \Theta$ induces a probability measure $P_\theta$ on the product space $Y^\infty$. The prior $\mu$ induces a measure $P_\mu$ on $Y^\infty$ defined by $P_\mu(A) = \int_\Theta P_\theta(A)\,\mu(d\theta)$. The application of the Martingale Convergence Theorem yields a.s. convergence of $\{\mu_t\}$, where the a.s. statement is with respect to the probability measure $P_\mu$. But this does not imply that for any particular $\theta$, $\mu_t \Rightarrow \mu_\infty$ with $P_\theta$ probability one, even if $\theta$ is in the support of $\mu$.

One might hope to establish a result that for a "large" class of priors, posteriors converge for a "large" class of parameter values. When $\Theta$ can be embedded in finite-dimensional Euclidean space, one has recourse to Lebesgue measure $m$ (restricted to $\Theta$) as a natural notion of size. Then if $m \ll \mu$, the exceptional set $\{\theta : P_\theta(\{\mu_t \Rightarrow \mu_\infty\}) < 1\}$ has $m$ measure zero. But in many naturally occurring settings $\Theta$ is not finite dimensional, and since there is no infinite-dimensional analogue of Lebesgue measure, a measure-theoretic criterion is unavailable.

In lieu of a reference measure to evaluate size, the customary procedure is to resort to the topological notion of category. Residual subsets are deemed to be large or generic, and subsets of first category (which are complements of residual subsets) are regarded as small. Freedman (1965) proved that when outcomes are i.i.d. and take values in a countable set, then for a residual set of parameter values and priors, posterior beliefs do not converge. In Section 3 of this paper we extend Freedman's result to outcomes in non-finite, complete, separable metric spaces.

Using the results of Section 3, in Sections 4 and 5 of the paper we analyze a two-armed bandit problem with geometric discounting where arm $k$ yields an outcome in a complete, separable metric space $Y_k$. If the infimum of the possible rewards from playing arm $k$ is less than the infimum from playing arm $k'$, then for a residual family of parameter values and priors, arm $k$ is with $P_\theta$ probability one chosen only finitely often. If the infima of the rewards are equal, then both arms are played infinitely often.

2.  NOTATION  AND  MATHEMATICAL  PRELIMINARIES 

2.1.  Notational  Conventions  and  Definitions 

The set of real numbers is denoted by $\mathbb{R}$. If $X$ is a topological space, then the Borel $\sigma$-field is denoted by $\mathcal{B}(X)$. The set of probability measures on $(X, \mathcal{B}(X))$ is denoted by $\mathcal{P}(X)$. For $x \in X$, the Dirac measure $\delta_x \in \mathcal{P}(X)$ is defined by $\delta_x(A) = 1$ if $x \in A$ and $\delta_x(A) = 0$ otherwise.

If $(X, d)$ is a metric space, $f: X \to \mathbb{R}$ is a Lipschitz function if for some $K < \infty$, $\sup_{x \ne y} \{|f(x) - f(y)|/d(x, y)\} < K$. If $f$ is Lipschitz, the Lipschitz seminorm $\|f\|_L$ is defined by $\|f\|_L = \sup_{x \ne y} \{|f(x) - f(y)|/d(x, y)\}$. If $f$ is a bounded Lipschitz function, the bounded Lipschitz norm is $\|f\|_{BL} = \|f\|_L + \|f\|_\infty$, where $\|f\|_\infty$ denotes the usual sup norm. The set of all real-valued, bounded Lipschitz functions on $(X, d)$ is denoted by $BL(X, d)$. Endowed with the bounded Lipschitz norm, $BL(X, d)$ is a Banach space (see e.g. Dudley (1989, Section 11.2)).

The dual bounded Lipschitz or Dudley metric $\beta$ on $\mathcal{P}(X)$ is defined by

$$\beta(P, Q) = \sup\left\{ \left| \int f\,dP - \int f\,dQ \right| : \|f\|_{BL} \le 1 \right\}$$

for $P, Q \in \mathcal{P}(X)$. If $X$ is separable, $\beta$ metrizes the topology of weak convergence on $\mathcal{P}(X)$. Further details on the properties of $\beta$ can be found in Dudley (1966) and Dudley (1989).
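As a rough numerical illustration of these definitions, the following sketch computes the Lipschitz seminorm and the bounded Lipschitz norm of a function on a finite metric space, where the suprema reduce to maxima. The point set, the metric `d`, and the function `f` are hypothetical choices made only for this example.

```python
from itertools import combinations


def lipschitz_seminorm(f, points, d):
    """Max over distinct pairs of |f(x) - f(y)| / d(x, y) on a finite point set."""
    return max(abs(f(x) - f(y)) / d(x, y) for x, y in combinations(points, 2))


def bl_norm(f, points, d):
    """Bounded Lipschitz norm: Lipschitz seminorm plus sup norm."""
    sup_norm = max(abs(f(x)) for x in points)
    return lipschitz_seminorm(f, points, d) + sup_norm


# Hypothetical example: three points on the real line with the usual distance.
points = [0.0, 1.0, 3.0]
d = lambda x, y: abs(x - y)
f = lambda x: min(x, 2.0) / 4.0

norm_f = bl_norm(f, points, d)

# Because norm_f <= 1, f is an admissible test function in the definition of
# the Dudley metric, so |f(0) - f(3)| is a lower bound for the distance
# between the Dirac measures at 0 and 3 on this three-point space.
lower_bound = abs(f(0.0) - f(3.0))
```

The lower bound is only that: the Dudley distance itself is a supremum over all admissible test functions, which this sketch does not attempt to compute.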

2.2.  A  Brief  Review  of  Baire  Category  Theory

For ease of reference, we summarize some needed facts pertaining to Baire category. Standard references include Kelley (1985, pp. 200-203), Oxtoby (1980) and Royden (1988, Section 7.8). Let $X$ be a metric space. A set $E \subset X$ is nowhere dense if its closure $\bar{E}$ has empty interior. A set $E$ is of first category or meager if it is the union of a countable collection of nowhere dense sets. If a set is not of first category then it is of second category. The complement of a set of first category is a residual set.


According to the Theorem of Baire (Royden (1988, Theorem 7.27)), if $X$ is a complete metric space then the intersection of a countable family of open dense subsets of $X$ is itself a dense subset of $X$.

3.  GENERIC  NONCONVERGENCE  OF  POSTERIORS  WITH  I.I.D.  OUTCOMES
IN  COMPLETE,  SEPARABLE  METRIC  SPACES 
3.1.  Assumptions  and  Results 

We first describe an index set $\Lambda$ and a sequence $Z_1, Z_2, \ldots$ of i.i.d. random variables defined on a probability space $(\Sigma, \mathcal{S}, P_\lambda)$ where $\lambda \in \Lambda$. The natural interpretation will be that the outcomes are sequentially observed by a Bayesian statistician for whom the "true" $\lambda$ is initially unknown and who has prior belief $\mu$. Building upon the work of Freedman (1965), we will investigate the topological size of the set of pairs $(\lambda, \mu)$ for which the sequence of Bayesian posterior beliefs converges $P_\lambda$ a.s. or in $P_\lambda$ probability to some limit posterior belief.

The sequence $\{Z_n\}$ takes values in $Z$, a non-finite, complete, separable metric space with Borel $\sigma$-field $\mathcal{B}(Z)$. The probability distribution of $Z_n$ is an element of $\Lambda = \{\lambda \in \mathcal{P}(Z) : \lambda \ll \nu\}$, where $\nu$ is a $\sigma$-finite measure on $(Z, \mathcal{B}(Z))$ with non-finite support. Without loss of generality we work in representation space and so define $\Sigma = Z^\infty$, $\mathcal{S} = \mathcal{B}(Z) \times \mathcal{B}(Z) \times \cdots$, and $P_\lambda = \lambda \times \lambda \times \cdots$. The function $Z_n: \Sigma \to Z$ is the projection of $\Sigma$ onto its $n$th coordinate, defined by $Z_n(z_1, z_2, \ldots) = z_n$.

To address the question of convergence of posterior beliefs we need topologies on $\Lambda$ and $\mathcal{P}(\Lambda)$. We will make use of two topologies on $\Lambda$, the total variation topology $\mathcal{T}_1$ and the topology of weak convergence $\mathcal{T}_w$. $\mathcal{T}_1$ is induced by the $L^1$ metric $d_1$ defined by

$$d_1(\lambda, \lambda') = \int_Z \left| \frac{d\lambda}{d\nu} - \frac{d\lambda'}{d\nu} \right| d\nu.$$

An essential fact is that $(\Lambda, d_1)$ is a complete, separable metric space. (Completeness follows from the completeness of $L^1(Z, \mathcal{B}(Z), \nu)$, and for separability see e.g., Strasser (1985, Lemma 4.1).) In contrast, defining $\beta_\Lambda$ as the Dudley metric on $\Lambda$ (which generates the topology $\mathcal{T}_w$), the metric space $(\Lambda, \beta_\Lambda)$ is separable, but not complete. Conveniently, the Borel $\sigma$-field of $(\Lambda, \mathcal{T}_1)$ is the same as the Borel $\sigma$-field of $(\Lambda, \mathcal{T}_w)$ (see Strasser (1985, Theorem 4.7)). So without ambiguity we can denote the Borel sets of $\Lambda$ by $\mathcal{B}(\Lambda)$.
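To make the total variation metric concrete, here is a minimal sketch for the special case in which the dominating measure is counting measure on a finite set of outcomes (a stand-in for the non-finite space of the text, chosen only so the integral becomes a finite sum). The function name `d1` and the example densities are assumptions of this illustration.

```python
def d1(p, q):
    """L1 distance between two densities with respect to counting measure.

    p and q map outcomes to probabilities; missing outcomes have density 0.
    """
    support = set(p) | set(q)
    return sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in support)


# Hypothetical densities on the outcomes {0, 1, 2}.
lam = {0: 0.5, 1: 0.25, 2: 0.25}
lam_prime = {0: 0.5, 1: 0.5}

distance = d1(lam, lam_prime)                 # 0 + 0.25 + 0.25
max_distance = d1({0: 1.0}, {1: 1.0})         # disjoint supports
```

Two laws with disjoint supports sit at the maximal distance of 2 in this metric, however close their supports may be in the underlying outcome space, which is one way to see that the total variation topology is finer than the topology of weak convergence.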

A prior distribution is a probability measure $\mu$ on $(\Lambda, \mathcal{B}(\Lambda))$. As indicated above, informally one can imagine that there is a Bayesian statistician who may not know the "true" $\lambda$, but has a prior $\mu$. Since $\Lambda$ has two topologies, there are two weak topologies on $\mathcal{P}(\Lambda)$, denoted in the obvious way by $\mathcal{W}_1$ and $\mathcal{W}_w$, with $\mathcal{W}_w$ weaker than $\mathcal{W}_1$. $\mathcal{W}_1$ and $\mathcal{W}_w$ generate the same $\sigma$-field, which we denote by $\mathcal{B}(\mathcal{P}(\Lambda))$. Convergence with respect to the $\mathcal{W}_1$ topology is denoted by $\Rightarrow$. Convergence with respect to the $\mathcal{W}_w$ topology is denoted by $\overset{w}{\Rightarrow}$. In this section of the paper the symbol $\beta$ denotes the Dudley metric on $\mathcal{P}(\Lambda)$ with respect to the $d_1$ metric on $\Lambda$. It follows from Billingsley (1968, p. 239) and Dudley (1989, Corollary 11.5.5) that the metric space $(\mathcal{P}(\Lambda), \beta)$ is complete and separable with $\beta$ generating the topology $\mathcal{W}_1$.

The updating rule $\Gamma: \mathcal{P}(\Lambda) \times Z \to \mathcal{P}(\Lambda)$ is a measurable function with the property that for each $\mu \in \mathcal{P}(\Lambda)$, $\Gamma(\mu, \cdot)$ is a regular version of conditional probability with respect to the prior probability $\mu$. The existence of such a function is established by Dynkin and Yushkevich (1979, p. 263). The $n$-period updating rule is $\Gamma_n: \mathcal{P}(\Lambda) \times \Sigma \to \mathcal{P}(\Lambda)$, recursively defined by $\Gamma_1(\mu, \sigma) = \Gamma(\mu, Z_1(\sigma))$ and $\Gamma_n(\mu, \sigma) = \Gamma(\Gamma_{n-1}(\mu, \sigma), Z_n(\sigma))$ for $n \ge 2$.
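The recursion can be sketched in code for the simplest case of a prior with finite support, where the regular conditional probability is given by Bayes' rule. This is only an illustration under that assumption; the names `update`, `update_n`, and the two-point example are hypothetical and not part of the paper.

```python
def update(prior, density, z):
    """One-step Bayes update for a prior with finite support.

    prior:   dict mapping a parameter label to its prior probability
    density: function (label, z) -> density of outcome z under that parameter
    """
    weights = {lam: p * density(lam, z) for lam, p in prior.items()}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("outcome has zero probability under the prior")
    return {lam: w / total for lam, w in weights.items()}


def update_n(prior, density, outcomes):
    """n-period rule: apply the one-step update to each outcome in turn."""
    posterior = dict(prior)
    for z in outcomes:
        posterior = update(posterior, density, z)
    return posterior


# Hypothetical example: two candidate laws on the outcomes {0, 1}.
table = {"a": {0: 0.5, 1: 0.5}, "b": {0: 0.25, 1: 0.75}}
density = lambda lam, z: table[lam][z]
prior = {"a": 0.5, "b": 0.5}

after_one = update_n(prior, density, [1])       # posterior on "a" is 0.4
after_two = update_n(prior, density, [1, 0])    # posterior on "a" is 4/7
```

Applying the one-step rule repeatedly gives the same posterior as conditioning on the whole sample at once, which is why the recursive definition is unambiguous in this finite setting.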

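A pair $(\lambda, \mu) \in \Lambda \times \mathcal{P}(\Lambda)$ is $\mathcal{T}_1$-consistent if $P_\lambda(\{\sigma : \Gamma_n(\mu, \sigma) \Rightarrow \delta_\lambda\}) = 1$. A pair $(\lambda, \mu) \in \Lambda \times \mathcal{P}(\Lambda)$ is $\mathcal{T}_w$-consistent if $P_\lambda(\{\sigma : \Gamma_n(\mu, \sigma) \overset{w}{\Rightarrow} \delta_\lambda\}) = 1$. (When $\nu$ has countable support, $\mathcal{T}_1 = \mathcal{T}_w$ and so the two definitions of consistency are identical.) Freedman (1963) proved that when $Z$ is finite and $\mu$ has full support, $(\lambda, \mu)$ is consistent for all $\lambda \in \Lambda$. It would be natural to conjecture a similar result for the case where $Z$ is not finite. Indeed, the consistency result for the finite outcome case has been generalized in a well-known paper of Schwartz (1965) and more recently by Barron (1988). However, Freedman (1965) demonstrated that in a topological sense "most" pairs are not consistent (in either sense) when $\nu$ has countable support, even if it is required that the prior $\mu$ has full support. More precisely, defining $S = \{\mu \in \mathcal{P}(\Lambda) : \operatorname{supp} \mu = \Lambda\}$, Freedman proved the striking result that there exist sets $R_\Lambda \subset \Lambda$ and $R_{\mathcal{P}(\Lambda)} \subset S$, residual in $\Lambda$ and $\mathcal{P}(\Lambda)$ respectively (which implies that $R_\Lambda \times R_{\mathcal{P}(\Lambda)}$ is residual in $\Lambda \times \mathcal{P}(\Lambda)$), such that for $(\lambda, \mu) \in R_\Lambda \times R_{\mathcal{P}(\Lambda)}$:

$$\limsup_{n \to \infty} \int_\Sigma \Gamma_n(\mu, \sigma)(G)\, P_\lambda(d\sigma) = 1 \quad \text{for all nonempty open subsets } G \subset \Lambda.$$

A corollary is that for $(\lambda, \mu)$ in the residual set $R_\Lambda \times R_{\mathcal{P}(\Lambda)}$, $P_\lambda(\{\sigma : \Gamma_n(\mu, \sigma) \Rightarrow \delta_\lambda\}) = 0$.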

In the next subsection we show that these non-convergence results of Freedman (1965) extend to the case where $\nu$ is any $\sigma$-finite measure with non-finite support. Endowing $\Lambda$ with the $d_1$ metric, and defining $\mathcal{R} = \{(\lambda, \mu) \in \Lambda \times \mathcal{P}(\Lambda) : \limsup_{n} \int_\Sigma \Gamma_n(\mu, \sigma)(G)\, P_\lambda(d\sigma) = 1 \text{ for all nonempty open } G \subset \Lambda\}$, we prove the following theorem and corollary.

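THEOREM 3.10. $\mathcal{R}$ is a residual subset of $\Lambda \times \mathcal{P}(\Lambda)$. Furthermore, defining $S = \{\mu \in \mathcal{P}(\Lambda) : \operatorname{supp} \mu = \Lambda\}$ and $\mathcal{R}_S = \mathcal{R} \cap (\Lambda \times S)$, $\mathcal{R}_S$ is a residual subset of $\Lambda \times \mathcal{P}(\Lambda)$.

COROLLARY 3.11. All $(\lambda, \mu) \in \mathcal{R}$ are neither $\mathcal{T}_1$- nor $\mathcal{T}_w$-consistent.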

3.2.  Proof  of  Theorem  and  Corollary 

While the basic structure of the proof closely resembles Freedman's (1965) proof of his Theorem, some modification and extension is required to adapt the argument to cover a non-discrete outcome space. In particular, Proposition 3.5 requires a different method of proof than the comparable intermediate result in Freedman (1965).

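We start with some definitions. Unless otherwise indicated, where relevant it should be understood that $\Lambda$ is endowed with the $\mathcal{T}_1$ topology. Define $h: \Lambda \times Z \to \mathbb{R}$ as a $(\mathcal{B}(\Lambda) \times \mathcal{B}(Z))$-measurable function such that $h(\lambda, \cdot)$ is a density for $\lambda$ with respect to $\nu$ (a proof of existence of such a function is provided by Strasser (1985, Lemma 4.6)). Let $\Lambda_+ = \{\lambda \in \Lambda : h(\lambda, z) > 0,\ \nu \text{ a.e.}\}$ and define $\Lambda_0 = {\sim}\Lambda_+$, the complement of $\Lambda_+$ in $\Lambda$. The set of probability measures on $(\Lambda, \mathcal{B}(\Lambda))$ that assign strictly positive probability to $\Lambda_+$ is $\mathcal{P}_+(\Lambda) = \{\mu \in \mathcal{P}(\Lambda) : \mu(\Lambda_+) > 0\}$.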

$\Lambda_+$ and $\mathcal{P}_+(\Lambda)$ are topologically "large" in the sense that each is a residual subset of a complete, separable metric space. In contrast, $\Lambda_0$ and $\mathcal{P}(\Lambda_0) = \{\mu \in \mathcal{P}(\Lambda) : \mu(\Lambda_0) = 1\}$ are of first category, albeit dense in respectively $\Lambda$ and $\mathcal{P}(\Lambda)$.


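LEMMA 3.1. $\Lambda_+ \subset \Lambda$ and $\mathcal{P}_+(\Lambda) \subset \mathcal{P}(\Lambda)$ are respectively dense $G_\delta$ (and hence residual) subsets of $\Lambda$ and $\mathcal{P}(\Lambda)$. $\Lambda_0 \subset \Lambda$ and $\mathcal{P}(\Lambda_0) \subset \mathcal{P}(\Lambda)$ are dense sets of first category.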

Proof. (i) We first establish the properties of $\Lambda_+$ and $\mathcal{P}_+(\Lambda)$. Define $\Lambda^j = \{\lambda \in \Lambda : \nu(\{z : h(\lambda, z) = 0\}) < j^{-1}\}$ for $j = 1, 2, \ldots$. Each $\Lambda^j$ is an open, dense subset of $\Lambda$. So by the Baire Category Theorem (see e.g., Royden (1988, Theorem 7.27)) $\Lambda_+ = \bigcap_{j=1}^{\infty} \Lambda^j$ is a dense $G_\delta$ set and hence residual (Royden (1988, Theorem 7.30)).

The claim that $\mathcal{P}_+(\Lambda)$ is a dense $G_\delta$ follows from Theorem 3.15 of Dubins and Freedman (1964). (Dubins and Freedman have a compactness assumption in Section 3 of their paper, but inspection of their proof reveals that completeness and separability are sufficient.)

(ii) We now verify the properties of $\Lambda_0$ and $\mathcal{P}(\Lambda_0)$. Since $\Lambda_0 = {\sim}\Lambda_+$ and $\mathcal{P}(\Lambda_0) = {\sim}\mathcal{P}_+(\Lambda)$, and $\Lambda_+$ and $\mathcal{P}_+(\Lambda)$ are residual, by definition $\Lambda_0$ and $\mathcal{P}(\Lambda_0)$ are of first category. To prove denseness, for arbitrary $\lambda \in \Lambda$ choose a sequence $\{\lambda^m\} \subset \Lambda$ such that $h(\lambda^m, z) = 0$ on a set of positive $\nu$ measure and $h(\lambda^m, \cdot)$ converges in $\nu$ measure to $h(\lambda, \cdot)$. (The existence of such a sequence follows from the fact that for every $a \in [0, 1]$ there exists $A_a \in \mathcal{B}(Z)$ such that $a = \int_{A_a} h(\lambda, z)\, \nu(dz)$.) But then $\lambda^m \to \lambda$, establishing the denseness of $\Lambda_0$. The density of $\mathcal{P}(\Lambda_0)$ now follows from Theorem II.6.3 of Parthasarathy (1967). ■

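Let $D = \{a_1, a_2, \ldots\}$ be a countable, dense subset of $\Lambda_+$ and hence a dense subset of $\Lambda$. We now construct a sequence $M_1, M_2, \ldots$ with $M_k \subset \mathcal{P}(\Lambda)$ such that for all $\lambda \in \Lambda_+$ and $\mu \in M_k$, $\Gamma_n(\mu, \sigma)$ converges $P_\lambda$ a.s. to $\delta_{a_k}$, the Dirac measure on $a_k$. Proceeding, we define $M_k \subset \mathcal{P}(\Lambda)$ by $M_k = \{\mu \in \mathcal{P}(\Lambda) : \text{(i) } \mu \text{ has finite support, (ii) } \mu(\{a_k\}) > 0, \text{ and (iii) } \mu(\Lambda_0) = 1 - \mu(\{a_k\})\}$. The set $M \subset \mathcal{P}(\Lambda)$ is defined by $M = \bigcup_{k=1}^{\infty} M_k$.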

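LEMMA 3.2. For $k = 1, 2, \ldots$, $M_k$ is a dense subset of $\mathcal{P}(\Lambda)$.

Proof. Select $a_k \in D$, $\mu \in \mathcal{P}(\Lambda)$ and define $\Xi = \{\gamma \in \mathcal{P}(\Lambda) : \operatorname{supp} \gamma \text{ is finite, and } \gamma(\Lambda_0) = 1\}$. Since $\Lambda_0$ is dense in $\Lambda$ (Lemma 3.1), by Theorem II.6.3 of Parthasarathy (1967), $\Xi$ is dense in $\mathcal{P}(\Lambda)$. So there exists $\gamma^m \Rightarrow \mu$, with $\gamma^m \in \Xi$. Define $\mu^m = m^{-1} \delta_{a_k} + (1 - m^{-1}) \gamma^m$. Since $\mu^m \in M_k$ and $\mu^m \Rightarrow \mu$, the proof is complete. ■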

Given the above definitions, it is intuitive that if the prior $\mu \in M_k$ and $\lambda_0 \in \Lambda_+$, then with $P_{\lambda_0}$ probability one, any a priori alternative to $a_k$ will eventually be deemed impossible and the posterior belief will converge to $\delta_{a_k}$.

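LEMMA 3.3. For $\lambda_0 \in \Lambda_+$ and $\mu \in M_k$, $P_{\lambda_0}(\{\sigma \in \Sigma : \Gamma_n(\mu, \sigma) \Rightarrow \delta_{a_k}\}) = 1$.

Proof. Suppose $\operatorname{supp} \mu = \{a_k, \lambda_1, \lambda_2, \ldots, \lambda_J\}$ where $\lambda_j \in \Lambda_0$ for $j = 1, \ldots, J$. Let $A_j = \{z \in Z : h(\lambda_j, z) = 0\}$. Define the exceptional set $E_j = \{\sigma = (z_1, z_2, \ldots) \in \Sigma : z_i \notin A_j \text{ for } i = 1, 2, \ldots\}$. Then $P_{\lambda_0}(E_j) = 0$, and for $\sigma \notin E_j$, $\Gamma_n(\mu, \sigma)(\{\lambda_j\}) > 0$ only finitely often. Since $\{\lambda_1, \ldots, \lambda_J\}$ is finite, $\Gamma_n(\mu, \sigma)(\{a_k\}) < 1$ only finitely often. ■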
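The elimination mechanism in this proof can be illustrated with a small deterministic example. The sketch below assumes a three-point outcome set (a finite stand-in for the non-finite space of the text) and a prior whose support consists of one strictly positive density, labeled `a_k`, and two alternatives that each vanish at some outcome; all names and numbers are hypothetical.

```python
from math import prod


def posterior(prior, h, outcomes):
    """Posterior for a finite-support prior after i.i.d. outcomes.

    Uses the product form: weight(lam) = prior(lam) * product of h[lam][z_i].
    """
    weights = {lam: p * prod(h[lam].get(z, 0.0) for z in outcomes)
               for lam, p in prior.items()}
    total = sum(weights.values())
    return {lam: w / total for lam, w in weights.items()}


# Hypothetical densities with respect to counting measure on {0, 1, 2}.
h = {
    "a_k":   {0: 0.2, 1: 0.3, 2: 0.5},   # strictly positive density
    "lam_1": {0: 0.5, 1: 0.5},           # zero density at outcome 2
    "lam_2": {1: 0.4, 2: 0.6},           # zero density at outcome 0
}
prior = {"a_k": 0.1, "lam_1": 0.45, "lam_2": 0.45}

after_one = posterior(prior, h, [2])      # outcome 2 rules out lam_1
after_two = posterior(prior, h, [2, 0])   # outcome 0 rules out lam_2 as well
```

Once an outcome lands in the zero set of an alternative, that alternative keeps posterior weight zero forever. Under a true law with strictly positive density each zero set is hit eventually with probability one, so the posterior ends up as the point mass on `a_k` even though `a_k` need not be the true law.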
Now for $k = 1, 2, \ldots$, let $\{O_{km}\}_{m=1}^{\infty}$ be a decreasing sequence of open subsets of $\Lambda$ with $\bar{O}_{k,m+1} \subset O_{km}$ and $\bigcap_{m=1}^{\infty} O_{km} = \{a_k\}$. Let $g_{km}: \Lambda \to \mathbb{R}$ be a bounded Lipschitz function such that $\|g_{km}\|_{BL} \le 1$, $g_{km}$ equals one on $O_{k,m+1}$ and vanishes on ${\sim}O_{km}$. The existence of functions satisfying these conditions follows from Proposition 11.2.3 of Dudley (1989).

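LEMMA 3.4. If $\lambda_0 \in \Lambda_+$ and $\mu \in M_k$, then

$$\lim_{n \to \infty} \int_\Sigma \left[ \int_\Lambda g_{km}(\lambda)\, \Gamma_n(\mu, \sigma)(d\lambda) \right] P_{\lambda_0}(d\sigma) = 1.$$

Proof. Define $G_{kmn}(\sigma) = \int_\Lambda g_{km}(\lambda)\, \Gamma_n(\mu, \sigma)(d\lambda)$. Applying Lemma 3.3 and the definition of weak convergence, $\lim_{n} G_{kmn}(\sigma) = 1$, $P_{\lambda_0}$ a.s. So by the Dominated Convergence Theorem, $\lim_{n} \int_\Sigma G_{kmn}(\sigma)\, P_{\lambda_0}(d\sigma) = 1$. ■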

For arbitrary prior beliefs $\mu \in \mathcal{P}(\Lambda)$ and "true" parameter $\lambda \in \Lambda$, the probability law (with respect to the measure $P_\lambda$) of the posterior mapping $\Gamma_n(\mu, \cdot): \Sigma \to \mathcal{P}(\Lambda)$ may vary with the choice of versions of conditional probability. It is easily confirmed, however, that if $\mu \in \mathcal{P}_+(\Lambda)$ then with $P_\lambda$ probability one, any two versions of conditional probability will agree. The next task is to verify that if we restrict attention to prior beliefs $\mu \in \mathcal{P}_+(\Lambda)$, then from the perspective of a statistical observer who "knows" $\lambda$, the expected value of the Bayesian's posterior expectation of $g_{km}$ is a continuous function of the prior $\mu$ and the true parameter $\lambda$. This is a lengthy exercise with the details provided in the Appendix. Since $\mathcal{P}_+(\Lambda)$ is a residual set, for the purposes of this paper this restricted continuity result suffices.

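PROPOSITION 3.5. The function $\Phi_{kmn}: \Lambda \times \mathcal{P}_+(\Lambda) \to \mathbb{R}$ defined by

$$\Phi_{kmn}(\lambda, \mu) = \int_\Sigma \int_\Lambda g_{km}(\lambda')\, \Gamma_n(\mu, \sigma)(d\lambda')\, P_\lambda(d\sigma)$$

is continuous for all $k$, $m$ and $n$.

Proof. See Appendix.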

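The remaining steps needed for the proof of Theorem 3.10 mimic Freedman (1965). To make the paper self-contained, modulo notational changes (and filling in some details) we replicate Freedman's ingenious argument. Define for $k, j, m, n = 1, 2, \ldots$, the set $R_{kmjn} \subset \Lambda \times \mathcal{P}_+(\Lambda)$ by:

$$R_{kmjn} = \left\{ (\lambda, \mu) \in \Lambda \times \mathcal{P}_+(\Lambda) : \int_\Sigma \left[ \int_\Lambda g_{km}(\lambda')\, \Gamma_n(\mu, \sigma)(d\lambda') \right] P_\lambda(d\sigma) \le 1 - j^{-1} \right\}.$$

And define $\mathcal{N} = \bigcup_{k=1}^{\infty} \bigcup_{m=1}^{\infty} \bigcup_{j=1}^{\infty} \bigcap_{n=j}^{\infty} R_{kmjn}$. The set of $(\lambda, \mu)$ pairs such that with $P_\lambda$ probability one, the Bayesian's posterior belief essentially concentrates in every nonempty open subset of $\Lambda$ infinitely often is:

$$\mathcal{R} = \left\{ (\lambda, \mu) \in \Lambda \times \mathcal{P}(\Lambda) : \limsup_{n} \int_\Sigma \Gamma_n(\mu, \sigma)(G)\, P_\lambda(d\sigma) = 1,\ \forall \text{ nonempty open } G \subset \Lambda \right\}.$$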

To aid the reader, we outline the structure of the remainder of the proof. In Propositions 3.6 and 3.7 we establish that (i) $\mathcal{N}$ is of first category in $\Lambda \times \mathcal{P}_+(\Lambda)$, and (ii) $[\Lambda \times \mathcal{P}_+(\Lambda)] \setminus \mathcal{R} \subset \mathcal{N}$, implying that $\mathcal{R} \cap [\Lambda \times \mathcal{P}_+(\Lambda)]$ is residual in $\Lambda \times \mathcal{P}_+(\Lambda)$. In conjunction with the fact that a residual subset of a residual subspace is residual (a consequence of Lemma 3.9), this establishes that $\mathcal{R}$ is residual in $\Lambda \times \mathcal{P}(\Lambda)$.

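PROPOSITION 3.6. For all $k, m, j \ge 1$, $\bigcap_{n=j}^{\infty} R_{kmjn}$ is a relatively closed, nowhere dense subset of $\Lambda \times \mathcal{P}_+(\Lambda)$. $\mathcal{N}$, the union over $k$, $m$ and $j$ of these intersections, is of first category in $\Lambda \times \mathcal{P}_+(\Lambda)$.

Proof. By Proposition 3.5, $R_{kmjn}$ is closed in $\Lambda \times \mathcal{P}_+(\Lambda)$ and so $\bigcap_{n=j}^{\infty} R_{kmjn}$ is closed in $\Lambda \times \mathcal{P}_+(\Lambda)$. By Lemma 3.4, if $(\lambda, \mu) \in \Lambda_+ \times M_k$ then $(\lambda, \mu) \notin \bigcap_{n=j}^{\infty} R_{kmjn}$. By Lemma 3.2, $\Lambda_+ \times M_k$ is dense in $\Lambda \times \mathcal{P}_+(\Lambda)$, and so $\bigcap_{n=j}^{\infty} R_{kmjn}$ is nowhere dense in $\Lambda \times \mathcal{P}_+(\Lambda)$. Since each $\bigcap_{n=j}^{\infty} R_{kmjn}$ is a closed, nowhere dense subset of $\Lambda \times \mathcal{P}_+(\Lambda)$, $\mathcal{N}$ is a countable union of closed, nowhere dense sets, and so is of first category in $\Lambda \times \mathcal{P}_+(\Lambda)$. ■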

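PROPOSITION 3.7. $[\Lambda \times \mathcal{P}_+(\Lambda)] \setminus \mathcal{R} \subset \mathcal{N}$.

Proof. Suppose $(\lambda, \mu) \in [\Lambda \times \mathcal{P}_+(\Lambda)] \setminus \mathcal{R}$. Then there exists $\varepsilon > 0$ and a nonempty open set $G \subset \Lambda$, such that $\limsup_{n} \int_\Sigma \Gamma_n(\mu, \sigma)(G)\, P_\lambda(d\sigma) < 1 - \varepsilon$. Choose $k$ and $m$ so that $\{\lambda' : g_{km}(\lambda') > 0\} \subset G$. Choose $j$ so that $1 - j^{-1} > 1 - \varepsilon$ and choose $n_0$ so that for all $n \ge n_0$, $\int_\Sigma \Gamma_n(\mu, \sigma)(G)\, P_\lambda(d\sigma) < 1 - j^{-1}$; enlarging $j$ if necessary, we may take $j \ge n_0$. Since $g_{km} \le 1$ and $g_{km}$ vanishes outside $G$, $\int_\Lambda g_{km}\, d\Gamma_n(\mu, \sigma) \le \Gamma_n(\mu, \sigma)(G)$. Then $(\lambda, \mu) \in R_{kmjn}$ for all $n \ge j$ and so $(\lambda, \mu) \in \bigcap_{n=j}^{\infty} R_{kmjn}$, which by the definition of $\mathcal{N}$ (the union over $k$, $m$ and $j$ of these intersections) implies that $(\lambda, \mu) \in \mathcal{N}$. ■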

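COROLLARY 3.8. $\mathcal{R} \cap [\Lambda \times \mathcal{P}_+(\Lambda)]$ is a residual subset of $\Lambda \times \mathcal{P}_+(\Lambda)$ in the relative topology.

Proof. By Propositions 3.6 and 3.7, $[\Lambda \times \mathcal{P}_+(\Lambda)] \setminus \mathcal{R}$ is of first category in $\Lambda \times \mathcal{P}_+(\Lambda)$, so the relative complement $\mathcal{R} \cap [\Lambda \times \mathcal{P}_+(\Lambda)]$ is residual in $\Lambda \times \mathcal{P}_+(\Lambda)$. ■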

LEMMA 3.9. Suppose $(Y, T)$ is a topological space and $(Y_0, U)$ is a subspace where $U$ is the relative topology.

(i) If $A \subset Y_0$ is nowhere dense in $(Y_0, U)$ then $A$ is nowhere dense in $(Y, T)$.

(ii) If $Y_0$ is a residual subset of $Y$ and $B \subset Y_0$ is residual in $(Y_0, U)$ then $B$ is a residual subset of $Y$.

Proof. (i) Suppose $A$ is not nowhere dense in $Y$. Then there is a nonempty open set $G \subset Y$ with $G \subset \bar{A}$. Since $G$ is open and contained in the closure of $A$, it meets $A \subset Y_0$, so $G \cap Y_0$ is a nonempty open set in $(Y_0, U)$. Furthermore, $G \cap Y_0 \subset \bar{A} \cap Y_0 = \operatorname{cl}_{Y_0} A$, the closure of $A$ in $(Y_0, U)$, where the equality follows from Theorem 1.16 of Kelley (1985). But this contradicts $A$ being nowhere dense in $Y_0$.

(ii) To establish that $B$ is residual in $Y$, observe that there exists a sequence of sets $\{A_i\}$ with $A_i \subset Y_0$, $Y_0 \setminus B = \bigcup_i A_i$, and $A_i$ nowhere dense in $(Y_0, U)$. So by (i), $A_i$ is nowhere dense in $Y$, implying that $Y_0 \setminus B$ is of first category in $Y$, so $(Y_0 \setminus B) \cup (Y \setminus Y_0)$ is of first category in $Y$, and $B = {\sim}((Y_0 \setminus B) \cup (Y \setminus Y_0))$ is residual in $Y$. ■

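Proof of THEOREM 3.10. By Lemma 3.1, $\mathcal{P}_+(\Lambda)$ is a residual subset of $\mathcal{P}(\Lambda)$. So by Theorem 15.3 of Oxtoby (1980), $\Lambda \times \mathcal{P}_+(\Lambda)$ is a residual subset of $\Lambda \times \mathcal{P}(\Lambda)$. By Corollary 3.8, $\mathcal{R} \cap [\Lambda \times \mathcal{P}_+(\Lambda)]$ is a residual subset of $\Lambda \times \mathcal{P}_+(\Lambda)$, and so by Lemma 3.9, $\mathcal{R} \cap [\Lambda \times \mathcal{P}_+(\Lambda)]$ is a residual subset of $\Lambda \times \mathcal{P}(\Lambda)$. This completes the proof that $\mathcal{R}$ is a residual subset of $\Lambda \times \mathcal{P}(\Lambda)$. Applying Theorem 3.13 of Dubins and Freedman (1964), $S$ is a residual subset of $\mathcal{P}(\Lambda)$, and by Oxtoby (1980, Theorem 15.3) $\Lambda \times S$ is a residual subset of $\Lambda \times \mathcal{P}(\Lambda)$. So $\mathcal{R} \cap (\Lambda \times S) = \mathcal{R}_S$ is residual. ■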

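Proof of COROLLARY 3.11. Pick $(\lambda, \mu) \in \mathcal{R}$, $\varepsilon > 0$, and choose a nonempty set $G \subset \Lambda$ open in the $\mathcal{T}_w$ topology with $\mathcal{T}_w$-closure $\bar{G}$ such that $\lambda \notin \bar{G}$. Since $(\lambda, \mu) \in \mathcal{R}$, $P_\lambda$ a.s. $\Gamma_n(\mu, \sigma)(G) > 1 - \varepsilon$ infinitely often, so $\Gamma_n(\mu, \sigma)({\sim}\bar{G}) < \varepsilon$ i.o. But by the standard characterization of weak convergence (see e.g., Billingsley (1968, Theorem 2.1)) this implies that $\Gamma_n(\mu, \sigma)$ does not converge $\overset{w}{\Rightarrow}$ to $\delta_\lambda$. And since $\mathcal{W}_w$ is weaker than $\mathcal{W}_1$, $\Gamma_n(\mu, \sigma)$ does not converge $\Rightarrow$ to $\delta_\lambda$. ■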

4.  AN  APPLICATION:  INFINITE  HORIZON  BANDIT  PROBLEMS  WITH  DISCOUNTING

4.1.  Introduction 

In  this  section  we  model  a  Bayesian  decision-maker  who  faces  an  infinite-horizon  two-armed  bandit  problem  and 
geometrically  discounts  future  rewards.  In  a  well-known  article  Rothschild  (1974)  applied  the  bandit  framework  to 
model  the  decision-making  of  a  monopolist  who  could  charge  one  of  two  prices  and  was  uncertain  of  the  distribution 
of demand associated with each price. An extended discussion of economic applications of bandit problems is provided by Kiefer (1989).

To orient the reader we first provide an informal description of the Bayesian's optimization problem. We then formally define the relevant probability spaces and reformulate the decision problem as a dynamic programming problem. Using the Gittins index, we provide a simple characterization of the behavioral rules of the decision-maker. In Section 5, we apply Theorem 3.10 to describe the asymptotic behavior of the decision-maker.

Time periods are indexed by $t = 0, 1, 2, \ldots$. In period $t$ the decision-maker selects an action or bandit-arm $x(t) \in X = \{x_1, x_2\}$. After choosing action $x(t) = x_k$, the realization $y(t) \in Y_k$ of a random element $Y(t)$ is observed, and a period reward $r_k(y(t))$ is received. Conditional upon the Bayesian choosing $x(t) = x_k$, the probability distribution of $Y(t)$ is an element $\theta_k \in \Theta_k \subset \mathcal{P}(Y_k)$. Defining $Y = Y_1 \cup Y_2$, and $r: X \times Y \to \mathbb{R}$ by $r(x_k, y) = r_k(y)$, the total reward or utility from the stream $(x(0), y(0), x(1), y(1), x(2), \ldots)$ is $\sum_{t=0}^{\infty} \alpha^t r(x(t), y(t))$, where the discount factor $\alpha \in [0, 1)$.
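The discounted objective can be sketched for a finite truncation of the stream as follows. The function name `discounted_reward`, the reward function, and the sample stream are hypothetical and serve only to show how the discount factor weights successive periods.

```python
def discounted_reward(actions, outcomes, reward, alpha):
    """Sum over t of alpha**t * reward(x(t), y(t)) for a finite stream."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("discount factor must lie in [0, 1)")
    return sum(alpha ** t * reward(x, y)
               for t, (x, y) in enumerate(zip(actions, outcomes)))


# Hypothetical stream: the reward equals the observed outcome on either arm.
reward = lambda x, y: y
actions = ["x1", "x2", "x1"]
outcomes = [1.0, 2.0, 4.0]

total = discounted_reward(actions, outcomes, reward, alpha=0.5)   # 1 + 1 + 1
```

Because the rewards are bounded and the discount factor is strictly less than one, the tail of the infinite sum beyond any truncation point is bounded by a geometric series, so finite truncations approximate the infinite-horizon total.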

The decision-maker may initially be uncertain of the "true" $\theta_1$ and $\theta_2$. Defining $\Theta = \Theta_1 \times \Theta_2$, her initial beliefs are given by a prior probability $\mu \in \mathcal{P}(\Theta)$. Since the decision-maker's choice of action at time $t$ may be influenced by previously observed random outcomes, the action $x(t)$ is the realization of a random variable $X(t)$. A policy is a sequence of random variables $\{X(t)\}$ taking values in $X$, that are measurable with respect to the information (i.e., sub-$\sigma$-fields) generated by past outcomes. The objective of the decision-maker is to select a policy that maximizes $E^\mu\left[\sum_{t=0}^{\infty} \alpha^t r(X(t), Y(t))\right]$, where the symbol $E^\mu$ (informally) denotes expectation with respect to a probability measure on the underlying probability space consistent with the prior $\mu$.

4.2.  Formal  Specification  of  the  Probability  Spaces

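The set $Y_k$ of outcomes that might arise from selecting action $x_k$ is a non-finite, complete, separable metric space with Borel $\sigma$-field $\mathcal{B}(Y_k)$. Define $\Omega_k = Y_k \times Y_k \times \cdots$ with $\mathcal{F}_k$ the corresponding product $\sigma$-field. We write $\mathcal{F}_{kn}$ for the $n$-fold product $\sigma$-field $\mathcal{B}(Y_k) \times \mathcal{B}(Y_k) \times \cdots \times \mathcal{B}(Y_k)$ with $\mathcal{F}_{k0} = \{\Omega_k, \emptyset\}$. The measurable space $(\Omega, \mathcal{F})$ is defined by $\Omega = \Omega_1 \times \Omega_2$, and $\mathcal{F} = \mathcal{F}_1 \times \mathcal{F}_2$. The projection function $\pi_k: \mathbb{N} \times \Omega_k \to Y_k$ is defined by $\pi_k(j, (\omega_{k1}, \omega_{k2}, \ldots)) = \omega_{kj}$. The interpretation of $\Omega_k$ is that for $\omega_k \in \Omega_k$, $\pi_k(j, \omega_k)$ is the outcome on the $j$th occasion that action $x_k$ is selected. Slightly abusing notation, when convenient we treat $\pi_k(j, \cdot)$ as a function from $\Omega$ to $Y$ defined by $\pi_k(j, \omega_1, \omega_2) = \pi_k(j, \omega_k)$.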

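The set of a priori possible probability laws on $(Y_k, \mathcal{B}(Y_k))$ is $\Theta_k = \{\theta_k \in \mathcal{P}(Y_k) : \theta_k \ll \nu_k\}$, where $\nu_k$ is a $\sigma$-finite measure on $(Y_k, \mathcal{B}(Y_k))$ with non-finite support. The reward function $r_k: Y_k \to \mathbb{R}$ is a bounded, measurable function with $\operatorname{ess\,inf} r_k(y_k) = \sup\{a \in \mathbb{R} : \nu_k(\{y_k : r_k(y_k) < a\}) = 0\} = b_k$, and $\operatorname{ess\,sup} r_k(y_k) = \inf\{a \in \mathbb{R} : \nu_k(\{y_k : r_k(y_k) > a\}) = 0\} = c_k$. We define $\|r_k\| = \max\{|b_k|, |c_k|\}$.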

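With $d_k$ the $L^1$ metric on $\Theta_k$, $(\Theta_k, d_k)$ is a complete, separable metric space. For $\theta_k \in \Theta_k$, the product measure $\theta_k \times \theta_k \times \cdots$ on $(\Omega_k, \mathcal{F}_k)$ is denoted by $P_{k,\theta_k}$. The product space $(\Theta, d_\Theta)$ is defined by $\Theta = \Theta_1 \times \Theta_2$ where $d_\Theta$ is a metric that metrizes the product topology. The Borel $\sigma$-field is $\mathcal{B}(\Theta) = \mathcal{B}(\Theta_1) \times \mathcal{B}(\Theta_2)$. For $\theta = (\theta_1, \theta_2) \in \Theta$, the product probability measure $P_\theta$ on $(\Omega, \mathcal{F})$ is defined by $P_\theta = P_{1,\theta_1} \times P_{2,\theta_2}$.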

In order to make use of Gittins index machinery, we require that observing an outcome from arm $k$ provides no information regarding the probability law governing arm $k'$, for $k \ne k'$. More formally, the prior belief must be a product probability on $(\Theta, \mathcal{B}(\Theta))$. Define $\Pi(\Theta) = \{\mu \in \mathcal{P}(\Theta) : \mu = \mu_1 \times \mu_2, \text{ s.t. } \mu_1 \in \mathcal{P}(\Theta_1),\ \mu_2 \in \mathcal{P}(\Theta_2)\}$. Identifying $\mu = \mu_1 \times \mu_2 \in \Pi(\Theta)$ by the vector $(\mu_1, \mu_2)$ of marginals, no confusion should arise from writing $(\mu_1, \mu_2)$ instead of $\mu_1 \times \mu_2$. The projection functions $p_k: \Pi(\Theta) \to \mathcal{P}(\Theta_k)$ are defined by $p_k((\mu_1, \mu_2)) = \mu_k$.

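Corresponding to a prior $\mu$ are induced probability measures $P_\mu$ and $Q_\mu$ on, respectively, $(\Omega, \mathcal{F})$ and
$(\Theta \times \Omega, \mathcal{B}(\Theta) \times \mathcal{F})$. For $\mu = (\mu_1, \mu_2) \in \Pi(\Theta)$, $Q_\mu$ is defined on measurable rectangles by

\[ Q_\mu(A \times B) = \int_A P_\theta(B)\, \mu(d\theta) \quad \text{for } A \in \mathcal{B}(\Theta) \text{ and } B \in \mathcal{F}. \]

By the Product Measure Theorem (see Ash (1972, Theorem 2.6.2)) there is a unique extension onto
$\mathcal{B}(\Theta) \times \mathcal{F}$. $P_\mu$ is defined by $P_\mu(B) = Q_\mu(\Theta \times B)$ for every $B \in \mathcal{F}$. From the perspective of
the Bayesian decision-maker who has prior $\mu$, the relevant probability spaces are $(\Omega, \mathcal{F}, P_\mu)$ and
$(\Theta \times \Omega, \mathcal{B}(\Theta) \times \mathcal{F}, Q_\mu)$. But from the perspective of a classical statistician, the probability space on which
all random variables are defined is $(\Omega, \mathcal{F}, P_\theta)$, where $\theta = (\theta_1, \theta_2) \in \Theta$ is the "true" parameter.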
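
To make the distinction between $P_\theta$, $Q_\mu$ and $P_\mu$ concrete, the following is a minimal sketch for a single arm, assuming a hypothetical finite stand-in for the model: two candidate laws on outcomes {0, 1} and a uniform prior. The paper's setting requires a non-finite outcome space, so the names `THETAS`, `PRIOR`, `p_theta`, `q_mu` and `p_mu` are illustrative assumptions only.

```python
from fractions import Fraction

# Hypothetical finite stand-in for Theta: two candidate laws on outcomes {0, 1}.
THETAS = {
    "low": {0: Fraction(3, 4), 1: Fraction(1, 4)},
    "high": {0: Fraction(1, 4), 1: Fraction(3, 4)},
}
PRIOR = {"low": Fraction(1, 2), "high": Fraction(1, 2)}  # assumed prior mu


def p_theta(theta, outcomes):
    """P_theta of the cylinder set fixing the first len(outcomes) i.i.d. draws."""
    prob = Fraction(1)
    for y in outcomes:
        prob *= THETAS[theta][y]
    return prob


def q_mu(param_set, outcomes):
    """Q_mu(A x B): sum over theta in A of P_theta(B) * mu(theta)."""
    return sum((p_theta(th, outcomes) * PRIOR[th] for th in param_set), Fraction(0))


def p_mu(outcomes):
    """P_mu(B) = Q_mu(Theta x B): the Bayesian's predictive probability of B."""
    return q_mu(THETAS.keys(), outcomes)
```

In this toy model the classical statistician who knows the true law is "high" evaluates the event (1, 1) with `p_theta("high", (1, 1))`, while the Bayesian evaluates the same event with the mixture `p_mu((1, 1))`.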

4.3.  Histories  and  Policies 

We now provide a precise statement of the optimization problem from the perspective of the decision-maker.
We start by defining an admissible plan. It will be convenient to include in our definition "count functions"
$C_k(t): \Omega \to \{0, 1, 2, \ldots\}$, for $k = 1, 2$ and $t = 0, 1, 2, \ldots$. The realization $C_k(t)(\omega)$ will be interpretable as the
number of occasions that arm $k$ has been chosen through time $t$.

The indicator function $I_k: X \to \{0, 1\}$ is defined by $I_k(x) = 1$ iff $x = x_k$. An admissible plan is a tuple
$[\{X(t)\}, \{Y(t)\}, \{C_1(t)\}, \{C_2(t)\}, \{\mathcal{H}_t\}]$ where for $t = 0, 1, 2, \ldots$, $X(t): \Omega \to X$, $Y(t): \Omega \to Y$,
$C_k(t): \Omega \to \{0, 1, \ldots\}$, and $\{\mathcal{H}_t\}$ is a sequence of sub-$\sigma$-fields of $\mathcal{F}$, such that

i)  $\mathcal{H}_0 = \{\Omega, \emptyset\}$,

ii)  $X(t): \Omega \to X$ is $\mathcal{H}_t$ measurable,

iii)  for $t \ge 1$, $\mathcal{H}_t = \mathcal{H}_{t-1} \vee \sigma(X(t-1), Y(t-1))$,

iv)  $C_k(0) = I_k(X(0))$,

v)  $C_k(t) = C_k(t-1) + I_k(X(t))$ for $t \ge 1$, and

vi)  if $X(t)(\omega) = x_k$ and $C_k(t)(\omega) = n$, then $Y(t)((\omega_1, \omega_2)) = \pi_k(n, \omega_k)$.
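
The bookkeeping in conditions (i) through (vi) can be sketched as a short simulation. This is only an illustration under stated assumptions: each arm's outcome sequence $\omega_k$ is supplied as a pre-drawn list, and `run_plan` and `choose` are hypothetical names, with `choose` standing for any rule that depends only on the observed history.

```python
def run_plan(choose, omega, horizon):
    """Generate (X(t), Y(t), C_1(t), C_2(t)) for t = 0, ..., horizon - 1.

    choose(history) returns an arm in {1, 2}; history lists the past (X, Y)
    pairs, so X(t) depends only on information contained in H_t.
    omega = {1: [...], 2: [...]} holds each arm's pre-drawn outcome sequence.
    """
    counts = {1: 0, 2: 0}
    history, rows = [], []
    for _ in range(horizon):
        arm = choose(history)              # condition (ii)
        counts[arm] += 1                   # conditions (iv) and (v)
        y = omega[arm][counts[arm] - 1]    # condition (vi): pi_k(n, omega_k), n = C_k(t)
        rows.append((arm, y, counts[1], counts[2]))
        history.append((arm, y))           # condition (iii): H_{t+1} adds (X(t), Y(t))
    return rows
```

Note that the $n$'th outcome of arm $k$ is consumed only on the $n$'th occasion that arm $k$ is played, regardless of how many periods have elapsed; this is what makes the outcome sequences of the two arms independent of the policy.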

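For $\alpha \in (0, 1)$, an admissible plan $[\{X(t)\}, \{Y(t)\}, \{C_1(t)\}, \{C_2(t)\}, \{\mathcal{H}_t\}]$ is $(\mu, \alpha)$-optimal^2 if for any other
admissible plan $[\{X(t)'\}, \{Y(t)'\}, \{C_1(t)'\}, \{C_2(t)'\}, \{\mathcal{H}_t'\}]$:

\[
\int_\Omega \Big[ \sum_{t=1}^{\infty} \alpha^{t-1} r(X(t)(\omega), Y(t)(\omega)) \Big] P_\mu(d\omega)
\;\ge\;
\int_\Omega \Big[ \sum_{t=1}^{\infty} \alpha^{t-1} r(X(t)'(\omega), Y(t)'(\omega)) \Big] P_\mu(d\omega).
\]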


4.4.  Reformulation  as  a  Bayesian  Dynamic  Programming  Problem 

In this section we recast the decision-maker's optimization problem as a Bayesian dynamic programming
problem with the state at time $t$ being the Bayesian's posterior probability $\mu(t) = (\mu_1(t), \mu_2(t)) \in \Pi(\Theta)$. As in
Blackwell (1965), a dynamic programming problem is defined as a quintuple $[X, u, \Pi(\Theta), \varphi, \alpha]$, where the action
space $X$, the discount factor $\alpha \in [0, 1)$ and the state space $\Pi(\Theta)$ are as previously defined. The expected reward
function for arm $k$ is $u_k: \mathcal{P}(\Theta_k) \to R$ given by

\[ u_k(\mu_k) = \int_{\Theta_k} \int_{Y_k} r_k(y_k)\, \theta_k(dy_k)\, \mu_k(d\theta_k). \]

The expected reward function $u: \Pi(\Theta) \times X \to R$ is defined by $u((\mu_1, \mu_2), x_k) = u_k(\mu_k)$.

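To define the transition probability $\varphi: \Pi(\Theta) \times X \to \mathcal{P}(\Pi(\Theta))$, we need to develop some notation for Bayesian
updating. Analogous to the definition of $\Gamma$ in Section 3, the arm $k$ updating maps $\Gamma_k: \mathcal{P}(\Theta_k) \times Y_k \to \mathcal{P}(\Theta_k)$,
$k = 1, 2$, are chosen so that: (i) $\Gamma_k$ is jointly measurable, and (ii) for each $\mu_k \in \mathcal{P}(\Theta_k)$, $\Gamma_k(\mu_k, \cdot)$ is a regular
version of conditional probability. The updating maps $\Gamma_{kn}: \mathcal{P}(\Theta_k) \times \Omega_k \to \mathcal{P}(\Theta_k)$ are recursively defined by:
(i) $\Gamma_{k1}(\mu_k, \omega_k) = \Gamma_k(\mu_k, \pi_k(1, \omega_k))$ and, (ii) for $n > 1$, $\Gamma_{kn}(\mu_k, \omega_k) = \Gamma_k(\Gamma_{k,n-1}(\mu_k, \omega_k), \pi_k(n, \omega_k))$.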
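
The recursion defining $\Gamma_{kn}$ is an ordinary fold of the one-step Bayes map over the first $n$ outcomes. The following is a minimal sketch, assuming a hypothetical finite parameter grid in place of $\Theta_k$ (the names `LIKELIHOOD`, `update` and `update_n` are illustrative and not part of the paper).

```python
from fractions import Fraction

# Hypothetical finite grid standing in for Theta_k: law name -> outcome probabilities.
LIKELIHOOD = {
    "low": {0: Fraction(3, 4), 1: Fraction(1, 4)},
    "high": {0: Fraction(1, 4), 1: Fraction(3, 4)},
}


def update(belief, likelihood, y):
    """One-step map Gamma_k: posterior over the grid after observing outcome y."""
    weights = {th: belief[th] * likelihood[th][y] for th in belief}
    total = sum(weights.values())
    if total == 0:
        # The conditional probability is only defined up to null sets.
        raise ValueError("outcome has zero predictive probability under this belief")
    return {th: w / total for th, w in weights.items()}


def update_n(belief, likelihood, omega, n):
    """Gamma_{kn}: apply the one-step map to the first n outcomes of omega."""
    for j in range(n):
        belief = update(belief, likelihood, omega[j])
    return belief
```

For example, starting from a uniform belief over the two laws and observing the outcome 1 twice moves the belief to 9/10 on "high".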

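Given that action $x_k$ is chosen with prior marginal belief $\mu_k$, the Bayesian's probability distribution over next
period marginal posterior beliefs is $\Psi_k(\mu_k) \in \mathcal{P}(\mathcal{P}(\Theta_k))$, where the map $\Psi_k: \mathcal{P}(\Theta_k) \to \mathcal{P}(\mathcal{P}(\Theta_k))$ is constructed as
follows. For $\mu_k \in \mathcal{P}(\Theta_k)$ and $B \in \mathcal{B}(\mathcal{P}(\Theta_k))$, define $\upsilon(\mu_k, B) = \Gamma_k(\mu_k, \cdot)^{-1}(B)$. Since $\upsilon(\mu_k, B) \in \mathcal{B}(Y_k)$, the map
$\theta_k \to \theta_k(\upsilon(\mu_k, B))$ is measurable and

\[ \Psi_k(\mu_k)(B) = \int_{\Theta_k} \theta_k(\upsilon(\mu_k, B))\, \mu_k(d\theta_k). \]

We can now define the transition probability $\varphi: \Pi(\Theta) \times X \to \mathcal{P}(\Pi(\Theta))$ by:

\[ \varphi((\mu_1, \mu_2), x_1)(A_1 \times A_2) = \Psi_1(\mu_1)(A_1)\, I_{A_2}(\mu_2), \]

and

\[ \varphi((\mu_1, \mu_2), x_2)(A_1 \times A_2) = \Psi_2(\mu_2)(A_2)\, I_{A_1}(\mu_1), \]

for $A_1 \in \mathcal{B}(\mathcal{P}(\Theta_1))$, $A_2 \in \mathcal{B}(\mathcal{P}(\Theta_2))$ and $I_{A_k}(\cdot)$ the indicator function.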
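
On a finite grid, $\Psi_k(\mu_k)$ is simply the distribution of the posterior induced by the predictive distribution of the next outcome. The sketch below assumes such a hypothetical finite setting (the function name `psi` and the dictionary representation of beliefs are illustrative); it also makes visible the martingale property that the expected posterior equals the current belief.

```python
from collections import defaultdict
from fractions import Fraction


def psi(belief, likelihood, outcomes):
    """Distribution over next-period posteriors when the arm is played once.

    Returns a dict mapping each posterior (a sorted tuple of (theta, prob)
    pairs) to the predictive probability of reaching it.
    """
    dist = defaultdict(Fraction)
    for y in outcomes:
        # Predictive probability of y: integral of theta(y) against the belief.
        p_y = sum(belief[th] * likelihood[th][y] for th in belief)
        if p_y == 0:
            continue
        post = tuple(sorted((th, belief[th] * likelihood[th][y] / p_y) for th in belief))
        dist[post] += p_y
    return dict(dist)
```

The transition probability $\varphi$ then moves only the marginal belief of the arm that is played and leaves the other marginal fixed, which is what the indicator factors $I_{A_2}(\mu_2)$ and $I_{A_1}(\mu_1)$ express.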

A policy is a sequence $\{X(t)\}$ such that there exists a (unique) admissible plan $[\{X(t)\}, \{Y(t)\}, \{C_1(t)\}, \{C_2(t)\}, \{\mathcal{H}_t\}]$.
The set of policies is denoted by $\Xi$. Given $\{X(t)\} \in \Xi$ and prior belief $\mu(0) = (\mu_1(0), \mu_2(0)) \in \Pi(\Theta)$, there is
an induced sequence $\{\mu(t)\} = \{\mu_1(t), \mu_2(t)\}$ of state variables (posterior beliefs), where $\mu_k(t): \Omega \to \mathcal{P}(\Theta_k)$. For $t \ge 1$,
$\mu_k(t)$ is defined by: (i) if $X(t-1)(\omega) = x_k$, then $\mu_k(t)(\omega) = \Gamma_k(\mu_k(t-1), \pi_k(C_k(t), \omega))$, and (ii) if $X(t-1)(\omega) \ne x_k$, then
$\mu_k(t)(\omega) = \mu_k(t-1)(\omega)$.

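For $\alpha \in (0, 1)$ and $\mu(0) \in \Pi(\Theta)$, a policy $\{X(t)\}$ with corresponding posterior sequence $\{\mu(t)\}$ is $(\mu(0), \alpha)$ DP-optimal
if for any policy $\{X'(t)\}$ with corresponding posterior sequence $\{\mu'(t)\}$:

\[
\int_\Omega \sum_{t=0}^{\infty} \alpha^t u(\mu(t)(\omega), X(t)(\omega))\, P_{\mu(0)}(d\omega)
\;\ge\;
\int_\Omega \sum_{t=0}^{\infty} \alpha^t u(\mu'(t)(\omega), X'(t)(\omega))\, P_{\mu(0)}(d\omega).
\]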


A seemingly obvious, but non-trivial fact is that for $\alpha \in (0, 1)$ an admissible plan $[\{X(t)\}, \{Y(t)\}, \{C_1(t)\}, \{C_2(t)\}, \{\mathcal{H}_t\}]$
is $(\mu, \alpha)$-optimal (as defined in Section 4.3) iff $\{X(t)\}$ is $(\mu, \alpha)$ DP-optimal. Formally, this result follows
from Theorem 7.3 of Reider (1975).

For $\alpha = 0$, the above definition of optimality would not be useful since it would impose no restrictions on
behavior after period 0. Associating $\alpha = 0$ with repeatedly myopic behavior, we define a policy $\{X(t)\}$ with
corresponding posterior sequence $\{\mu(t)\}$ to be $(\mu(0), 0)$ DP-optimal if for all $t$ and $k$, $u(\mu(t), X(t)) \ge u(\mu(t), x_k)$, $P_{\mu(0)}$ a.s.
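
The myopic ($\alpha = 0$) rule only requires the expected reward function $u_k$. The following is a minimal sketch, assuming hypothetical finite grids for the parameter and outcome spaces; `expected_reward` and `myopic_choice` are illustrative names, and breaking ties in favor of arm 1 is an arbitrary assumption.

```python
from fractions import Fraction


def expected_reward(belief, likelihood, reward):
    """u_k(mu_k) = sum over theta of mu_k(theta) * sum over y of r_k(y) * theta(y)."""
    return sum(
        belief[th] * sum(reward[y] * p for y, p in likelihood[th].items())
        for th in belief
    )


def myopic_choice(beliefs, likelihoods, rewards):
    """alpha = 0: choose an arm maximizing current expected reward (ties go to arm 1)."""
    u = {k: expected_reward(beliefs[k], likelihoods[k], rewards[k]) for k in (1, 2)}
    return 1 if u[1] >= u[2] else 2
```

Because the rule ignores the informational value of an action, it coincides with the discounted problem only in the limit where future rewards carry no weight.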

A policy $\{X(t)\}$ with associated posterior maps $\{\mu_1(t)\}$, $\{\mu_2(t)\}$ is stationary if there exists a policy function
$\xi: \Pi(\Theta) \to X$ such that $\xi$ is measurable and for all $t$, $X(t) = \xi(\mu_1(t), \mu_2(t))$. If $\{X(t)\}$ is stationary and DP-optimal,
then the policy function $\xi$ is an optimal policy function. Applying standard results in dynamic programming (see
e.g., Blackwell (1965) or Maitra (1968)) it is routine to verify that an optimal, stationary policy exists.

We now restate some standard dynamic programming results in the context of our model. For discount factor $\alpha$,
the value function $V: \Pi(\Theta) \to R$ is defined by

\[
V(\mu) = \sup_{\{X(t)\} \in \Xi} \Big\{ \int_\Omega \sum_{t=0}^{\infty} \alpha^t u(\mu(t)(\omega), X(t)(\omega))\, P_\mu(d\omega) \Big\},
\]

where $\{\mu(t)\}$ denotes the posterior sequence corresponding to $\{X(t)\}$. $V$ satisfies the Bellman or optimality equation
(see e.g. Blackwell (1965, Theorem 6 (e))):

\[
V(\mu) = \max_{x \in X} \Big\{ u(\mu, x) + \alpha \int_{\Pi(\Theta)} V(\mu')\, \varphi(\mu, x)(d\mu') \Big\}.
\]

Additionally, $\xi$ is an optimal policy function iff it solves the optimality equation (Blackwell (1965,
Theorem 6 (f))). That is, for all $\mu \in \Pi(\Theta)$:

\[
V(\mu) = u(\mu, \xi(\mu)) + \alpha \int_{\Pi(\Theta)} V(\mu')\, \varphi(\mu, \xi(\mu))(d\mu').
\]

4.5.  Gittins Index Characterization of Optimal Policy

There is a simple characterization of optimal policies, based upon initial results of Gittins and Jones (1974), and
subsequent refinements by Berry and Fristedt (1985), Ross (1983) and Whittle (1982). Gittins and Jones proved the
existence of functions $M_1: \mathcal{P}(\Theta_1) \to R$ and $M_2: \mathcal{P}(\Theta_2) \to R$ with the property that it is optimal to choose arm 1 in
period $t$ iff $M_1(\mu_1(t)) \ge M_2(\mu_2(t))$. The function $M_k$ is commonly referred to as the Gittins Index for arm $k$.

To motivate the definition of $M_k$ consider a one-armed bandit problem where in the initial stage the
decision-maker has the option of playing arm 1 or stopping and collecting a terminal reward of $m$. In subsequent stages,
assuming arm 1 has been played in all previous stages, the options remain selecting arm 1 or stopping and receiving
a final payment $m$. The value of the Gittins Index for belief $\mu_k$ is the terminal reward $m$ such that the decision-maker
is indifferent between continuing and stopping.

More precisely, denote the set of stopping times on $(\Omega_k, \mathcal{F}_k)$ as $S_k = \{\tau: \Omega_k \to N_0,\ \tau^{-1}(n) \in \mathcal{F}_{kn}\}$. The
expected total reward of the stopping time policy $\tau$ with terminal payoff $m$ and belief $\mu_k \in \mathcal{P}(\Theta_k)$ is given by the
function $T_k: S_k \times \mathcal{P}(\Theta_k) \times R \to R$, defined by

\[
T_k(\tau, \mu_k, m) = \int_{\Omega_k} \Big[ \sum_{t=0}^{\tau(\omega_k)-1} \alpha^t r_k(\pi_k(t+1, \omega_k)) + \alpha^{\tau(\omega_k)} m \Big] P_{k,\mu_k}(d\omega_k).
\]

The value of the optimal policy is $V_k(\mu_k, m) = \sup_{\tau \in S_k} \{T_k(\tau, \mu_k, m)\}$. Recalling that $c_k$ = ess sup $r_k(y_k)$, it is
routine to verify that for all $\mu_k \in \mathcal{P}(\Theta_k)$, $V_k(\mu_k, m) = m$ for all $m \ge c_k (1 - \alpha)^{-1}$. The Gittins Index is defined by
$M_k(\mu_k) = \inf\{m: V_k(\mu_k, m) = m\}$. A characterization of the optimal policy in terms of the Gittins Index is given by
the following proposition.

PROPOSITION 4.1. A policy $\{X(t)\}$ with posterior maps $\{\mu_1(t)\}$, $\{\mu_2(t)\}$ is $(\mu_1(0), \mu_2(0))$-optimal iff $X(t)(\omega) = x_k$
whenever $M_k(\mu_k(t)(\omega)) > M_j(\mu_j(t)(\omega))$.
Proof. Whittle (1982, Theorem 14.4.1). ■

Throughout the remainder of the paper we assume that the Bayesian controller follows an optimal stationary
policy $\{X(t)\}$ with posterior maps $\{\mu_1(t)\}$, $\{\mu_2(t)\}$, and policy function $\xi: \Pi(\Theta) \to X$ defined by $\xi((\mu_1(t), \mu_2(t))) = x_1$
iff $M_1(\mu_1(t)) \ge M_2(\mu_2(t))$. For a more detailed development of the Gittins Index and the optimality of the Index
policy, the texts of Ross (1983), Whittle (1982), and Berry and Fristedt (1985) are recommended.
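
The definition $M_k(\mu_k) = \inf\{m: V_k(\mu_k, m) = m\}$ suggests a direct numerical recipe: compute the stopping-problem value for a candidate terminal payoff and bisect on $m$. The sketch below is illustrative only and rests on strong assumptions that the paper does not make: the arm is Bernoulli with rewards in {0, 1}, beliefs are Beta(a, b) (a finite-dimensional family), and the recursion is truncated at a fixed depth, where the posterior mean is treated as known. All function names are hypothetical.

```python
from functools import lru_cache


def retirement_value(a, b, m, alpha, depth):
    """Approximate V_k(mu_k, m) for a Beta(a, b) belief over a Bernoulli arm."""

    @lru_cache(maxsize=None)
    def v(s, f, d):
        mean = s / (s + f)
        if d == 0:
            # Truncation (an approximation): treat the posterior mean as known.
            return max(m, mean / (1 - alpha))
        cont = mean + alpha * (mean * v(s + 1, f, d - 1) + (1 - mean) * v(s, f + 1, d - 1))
        return max(m, cont)  # stop and collect m, or play the arm once more

    return v(a, b, depth)


def gittins_index(a, b, alpha, depth=60, tol=1e-6):
    """M_k(mu_k) = inf{m : V_k(mu_k, m) = m}, located by bisection on m."""
    lo, hi = 0.0, 1.0 / (1 - alpha)  # b_k/(1-alpha) and c_k/(1-alpha) for rewards in {0, 1}
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if retirement_value(a, b, mid, alpha, depth) > mid + 1e-12:
            lo = mid  # continuing is strictly better, so the index exceeds mid
        else:
            hi = mid
    return hi


def index_policy(belief1, belief2, alpha):
    """Policy function xi: arm 1 iff M_1(mu_1) >= M_2(mu_2); beliefs are (a, b) pairs."""
    return 1 if gittins_index(*belief1, alpha) >= gittins_index(*belief2, alpha) else 2
```

Bisection is justified here because the gap between the stopping-problem value and $m$ is monotone in $m$, so the set of terminal payoffs at which stopping is optimal is an upper interval. Note that the index is expressed in units of total terminal payoff, so it ranges between $b_k/(1-\alpha)$ and $c_k/(1-\alpha)$.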
5.  GENERIC  LIMIT  THEOREMS
5.1.  Some  Continuity  Results

Preparatory to proving the genericity theorems, some preliminary technical results are developed in this
subsection. The principal result is the establishment of the continuity of the functions $M_1$ and $M_2$. The first step is
to develop characterizations of $V_1$ and $V_2$ by making use of the fact that these value functions are solutions to the
optimality equation.

LEMMA 5.1. The function $V_k: \mathcal{P}(\Theta_k) \times R \to R$ is continuous, for $k = 1, 2$.

Proof. Without loss of generality we consider the case of $k = 1$. We first note that an easy consequence of Lemma
A.5 is that $\Psi_1$ is continuous. Let $C(\mathcal{P}(\Theta_1))$ denote the space of bounded, continuous real-valued functions on $\mathcal{P}(\Theta_1)$
endowed with the sup norm. Since $u(\mu_1, \mu_2, x_1)$ is independent of $\mu_2$, we may define $u_1(\mu_1) = u(\mu_1, \mu_2, x_1)$. For
each $m \in R$, define the mapping $\vartheta_m: C(\mathcal{P}(\Theta_1)) \to C(\mathcal{P}(\Theta_1))$ by

\[ \vartheta_m(\zeta)(\mu_1) = \max\Big\{ m,\ u_1(\mu_1) + \alpha \int \zeta(\gamma_1)\, \Psi_1(\mu_1)(d\gamma_1) \Big\}. \]

(The continuity of $\vartheta_m(\zeta)(\cdot)$ follows from the continuity of $u_1$ and $\Psi_1$.) By standard
arguments one can verify that $\vartheta_m$ is a contraction mapping with modulus $\alpha$. Therefore the fixed-point of $\vartheta_m$ is a
continuous function of $m$. Since $V_1(\cdot, m)$ is the fixed-point of $\vartheta_m$, the map $(\mu_1, m) \to V_1(\mu_1, m)$ is continuous. ■
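
The contraction argument in the proof can be checked numerically on a toy example. The sketch below replaces the belief space by a hypothetical finite state space with transition matrix `P` and one-period reward vector `u`; `operator_m`, `sup_dist` and `fixed_point` are illustrative names, not objects from the paper.

```python
def operator_m(zeta, m, u, P, alpha):
    """Finite-state analogue of the map in the proof:
    (T_m zeta)(s) = max{m, u(s) + alpha * sum over s' of P[s][s'] * zeta(s')}."""
    n = len(u)
    return [
        max(m, u[s] + alpha * sum(P[s][j] * zeta[j] for j in range(n)))
        for s in range(n)
    ]


def sup_dist(f, g):
    """Sup-norm distance between two functions on the finite state space."""
    return max(abs(x - y) for x, y in zip(f, g))


def fixed_point(m, u, P, alpha, tol=1e-10):
    """Iterate the contraction from zero until successive iterates agree to tol."""
    zeta = [0.0] * len(u)
    while True:
        nxt = operator_m(zeta, m, u, P, alpha)
        if sup_dist(nxt, zeta) < tol:
            return nxt
        zeta = nxt
```

Two facts used in the proof are visible here: applying the operator shrinks sup-norm distances by at least the factor $\alpha$, and since the operators for $m$ and $m'$ differ by at most $|m - m'|$ in sup norm, their fixed points differ by at most $|m - m'|/(1 - \alpha)$, which gives continuity in $m$.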

LEMMA 5.2. For $k = 1, 2$, and $\mu_k \in \mathcal{P}(\Theta_k)$, the map $m \to [V_k(\mu_k, m) - m]$ is decreasing.

Proof. Replacing sums with integrals, the proof of Ross (1983, Lemma VII.2.1) remains valid. ■

Recall that $\Psi_k(\mu_k)$ is the Bayesian's probability distribution over next period beliefs on $\Theta_k$, given that arm $k$ is
chosen and $\mu_k$ is the current belief. Define $W_k: \mathcal{P}(\Theta_k) \times R \to R$ by

\[ W_k(\mu_k, m) = u_k(\mu_k) + \alpha \int_{\mathcal{P}(\Theta_k)} V_k(\mu_k', m)\, \Psi_k(\mu_k)(d\mu_k'). \]

$W_k(\mu_k, m)$ is the expected reward for the one-armed bandit problem with the option
of stopping and receiving a payoff of $m$ after the initial period.

LEMMA 5.3. (i) $W_1$ and $W_2$ are continuous. (ii) $M_k(\mu_k) < m$ iff $W_k(\mu_k, m) < m$.

Proof. (i) This follows from the continuity of $\Psi_k$ and Lemma 5.1.

(ii) Suppose $W_k(\mu, m) < m$. By definition of $W_k$, $m > u_k(\mu) + \alpha \int V_k(\mu', m)\, \Psi_k(\mu)(d\mu')$. Since $V_k$ is continuous
(Lemma 5.1) and increasing in $m$, $m - \epsilon > u_k(\mu) + \alpha \int V_k(\mu', m - \epsilon)\, \Psi_k(\mu)(d\mu')$ for $\epsilon$ sufficiently small. But this
last inequality is, by definition, equivalent to $M_k(\mu) \le m - \epsilon < m$.

For the converse, suppose $M_k(\mu) = \bar{m} < m$. Since $\bar{m}$ is a terminal payoff for which the decision-maker is
indifferent between continuing play and quitting, $W_k(\mu, \bar{m}) = V_k(\mu, \bar{m}) = \bar{m}$. From Lemma 5.2, for all $\mu \in \mathcal{P}(\Theta_k)$,
$V_k(\mu, m) - V_k(\mu, \bar{m}) \le (m - \bar{m})$, implying that $\int [V_k(\mu', m) - V_k(\mu', \bar{m})]\, \Psi_k(\mu)(d\mu') \le (m - \bar{m})$. Finally, we have
that:

\[
\begin{aligned}
W_k(\mu, m) &= u_k(\mu) + \alpha \int V_k(\mu', m)\, \Psi_k(\mu)(d\mu') \\
&= u_k(\mu) + \alpha \Big\{ \int V_k(\mu', m)\, \Psi_k(\mu)(d\mu') + \int V_k(\mu', \bar{m})\, \Psi_k(\mu)(d\mu') - \int V_k(\mu', \bar{m})\, \Psi_k(\mu)(d\mu') \Big\} \\
&= \Big\{ u_k(\mu) + \alpha \int V_k(\mu', \bar{m})\, \Psi_k(\mu)(d\mu') \Big\} + \alpha \Big\{ \int V_k(\mu', m)\, \Psi_k(\mu)(d\mu') - \int V_k(\mu', \bar{m})\, \Psi_k(\mu)(d\mu') \Big\} \\
&= \bar{m} + \alpha \Big\{ \int V_k(\mu', m)\, \Psi_k(\mu)(d\mu') - \int V_k(\mu', \bar{m})\, \Psi_k(\mu)(d\mu') \Big\} \\
&\le \bar{m} + \alpha (m - \bar{m}) \\
&< m. \qquad ■
\end{aligned}
\]

PROPOSITION 5.4. $M_1$ and $M_2$ are continuous.

Proof. First we establish that $M_k$ is lower-semicontinuous by verifying that for all $c \in R$ the set
$\{\mu \in \mathcal{P}(\Theta_k): M_k(\mu) > c\}$ is open. Suppose that $M_k(\bar{\mu}) > c$. By definition of $M_k$, $V_k(\bar{\mu}, c) - c > 0$. From the
continuity of $V_k$ (Lemma 5.1), there exists an open neighborhood $N$ of $\bar{\mu}$ such that for all $\mu \in N$, $V_k(\mu, c) - c > 0$.
Therefore $M_k(\mu) > c$ for all $\mu \in N$.

Upper-semicontinuity is confirmed by demonstrating that the set $\{\mu \in \mathcal{P}(\Theta_k): M_k(\mu) < c\}$ is open for all $c \in R$.
Suppose that $M_k(\bar{\mu}) = m < c$. By Lemma 5.3, $W_k(\bar{\mu}, c) < c$, and by the continuity of $W_k$ there exists an open neighborhood $J$
of $\bar{\mu}$ such that for $\mu \in J$, $W_k(\mu, c) < c$. But, again by Lemma 5.3, this implies that $M_k(\mu) < c$. ■

5.2.  Generic Outcomes when $b_1 \ne b_2$

In this subsection we analyze the limit behavior of the decision-maker for the case where $b_1 \ne b_2$ (recall that $b_k$
= ess inf $\{r_k(y_k): y_k \in Y_k\}$). Without loss of generality we assume that $b_1 < b_2$. To motivate the next result, choose
a set $Y_0 \subset Y_1$ with $\nu_1(Y_0) > 0$ and $\sup\{r_1(y_1): y_1 \in Y_0\} < b_2$. So if at time $t$, the decision-maker's conditional
probability of an outcome in $Y_0$ occurring if arm 1 is selected is sufficiently large, arm 2 will be selected regardless
of $\mu_2(t)$; and so arm 2 will be selected at all times $t' \ge t$. From Theorem 3.10 we can conclude that there is a residual
set $\mathcal{R}_1 \subset \Theta_1 \times \mathcal{P}(\Theta_1)$ with the following property. If $(\theta_1, \mu_1) \in \mathcal{R}_1$, $\mu_1$ is the prior belief on $(\Theta_1, \mathcal{B}(\Theta_1))$, and arm
1 is played sufficiently often, the decision-maker will eventually, $P_{\theta_1}$ a.s., become sufficiently pessimistic that arm
1 will never be tested again. Consequently, arm 1 will be played only finitely often.

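To formalize the above remarks, we begin by defining $Y_0$ as above. Choose $\mu_{10} \in \mathcal{P}(\Theta_1)$ such that $\mu_{10}(Y_0) = 1$.
For $\epsilon > 0$ define an open neighborhood $G_\epsilon \subset \mathcal{P}(\Theta_1)$ of $\mu_{10}$ by $G_\epsilon = \{\mu_1 \in \mathcal{P}(\Theta_1): \beta(\mu_1, \mu_{10}) < \epsilon\}$.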

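LEMMA 5.5. There exists $c > 0$ such that for all $\epsilon < c$, $\sup_{\mu_1 \in G_\epsilon} \{M_1(\mu_1)\} < \inf_{\mu_2 \in \mathcal{P}(\Theta_2)} \{M_2(\mu_2)\}$.

Proof. First observe that: (i) $\inf\{M_2(\mu_2): \mu_2 \in \mathcal{P}(\Theta_2)\} = \frac{b_2}{1-\alpha}$, and (ii) $M_1(\mu_{10}) < \frac{b_2}{1-\alpha}$. Since $M_1$ is continuous,
$J = \{\mu_1 \in \mathcal{P}(\Theta_1): M_1(\mu_1) < \frac{b_2}{1-\alpha}\}$ is open. So there exists $c$ such that for $\epsilon < c$, $G_\epsilon \subset J$. And so for $\mu_1 \in G_\epsilon \subset J$, and any
$\mu_2 \in \mathcal{P}(\Theta_2)$, $M_1(\mu_1) < \frac{b_2}{1-\alpha} \le M_2(\mu_2)$. ■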

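PROPOSITION 5.6. Suppose $b_1 < b_2$. Then there exists a residual subset $\mathcal{R}_1 \subset \Theta_1 \times \mathcal{P}(\Theta_1)$ such that for
$(\theta_1, \mu_1(0)) \in \mathcal{R}_1$ and all $(\theta_2, \mu_2(0)) \in \Theta_2 \times \mathcal{P}(\Theta_2)$, $P_\theta$ a.s. $X(t)(\omega) = x_1$ only finitely often.

Proof. Invoking Lemma 5.5, choose an open set $G_\epsilon$ such that $M_1(\mu_1) < M_2(\mu_2)$ for all $(\mu_1, \mu_2) \in G_\epsilon \times \mathcal{P}(\Theta_2)$.
Define the set $\Omega_{1\epsilon} \subset \Omega_1$ by $\Omega_{1\epsilon} = \{\omega_1 \in \Omega_1: \Gamma_{1n}(\mu_1(0), \omega_1) \notin G_\epsilon$, for all $n = 1, 2, \ldots\}$. Define
$\mathcal{R}_1 = \{(\theta_1, \mu_1(0)) \in \Theta_1 \times \mathcal{P}(\Theta_1): P_{\theta_1}(\Omega_{1\epsilon}) = 0\}$. By Theorem 3.10, $\mathcal{R}_1$ is residual. ■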


5.3.  Generic Outcomes when $b_1 = b_2$

When $b_1 = b_2$, the "typical" outcome will be that the decision-maker will start out playing one arm, but will
eventually become sufficiently pessimistic regarding the first choice and switch to the other arm. Eventually, though,
she will become sufficiently pessimistic regarding the non-initial arm and switch back to the original arm. This
switching back and forth will continue forever.

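PROPOSITION 5.7. Suppose $b_1 = b_2 = b$ and $\nu_k(\{y_k: r_k(y_k) \ne b\}) > 0$. Then there is a residual set $\mathcal{R} \subset \Theta \times \Pi(\Theta)$
such that $P_\theta$ a.s., $X(t) = x_1$ infinitely often and $X(t) = x_2$ infinitely often.

Proof. Let $\mathcal{R}_k = \{(\theta_k, \mu_k(0)) \in \Theta_k \times \mathcal{P}(\Theta_k): P_{\theta_k}$ a.s., $\limsup_n \Gamma_{kn}(\mu_k(0), \omega_k)(G) = 1$, for all open $G \subset \Theta_k\}$. By
Theorem 3.10, $\mathcal{R}_k$ is residual, and so $\mathcal{R} = \mathcal{R}_1 \times \mathcal{R}_2$ is residual. Now choose $(\theta_1, \mu_1(0), \theta_2, \mu_2(0)) \in \mathcal{R}$, and define
$E_1 = \{\omega: \sup_t C_1(t)(\omega)$ is finite$\}$. $E_1$ is the $\omega$-set for which arm 1 is played only finitely often, given the optimal policy
starting from beliefs $(\mu_1(0), \mu_2(0))$. Because of the symmetry of the specification, it suffices to prove that $P_\theta(E_1) = 0$.
For $\omega \in E_1$, define the terminal value of the Gittins index for arm 1 as $m_{1\infty}(\omega) = M_1(\Gamma_{1,\sup_t C_1(t)(\omega)}(\mu_1(0), \omega_1))$.
Since $(\theta_1, \mu_1(0)) \in \mathcal{R}_1$, $P_\theta(E_1 \cap \{\omega: m_{1\infty}(\omega) = \frac{b}{1-\alpha}\}) = 0$. Now let
$E_2 = \{(\omega_1, \omega_2): \limsup_n \Gamma_{2n}(\mu_2(0), \omega_2)(G) = 1$, for all open $G \subset \Theta_2\}$, and by Theorem 3.10, $P_\theta(E_2) = 1$.
By the continuity of $M_2$, $E_1 \subset (E_1 \cap \{\omega: m_{1\infty}(\omega) = \frac{b}{1-\alpha}\}) \cup {\sim}E_2$, and so $P_\theta(E_1) = 0$. ■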

6.  CONCLUDING  REMARKS 
In Section 3, we demonstrated the asymptotic sensitivity of posterior beliefs with respect to a prior probability
on a parameter space which is not finite dimensional. These results have significance for Bayesian statistical decision
theory. Since posterior beliefs are not robust to small perturbations of the prior, the optimal action correspondence
is similarly non-robust. If two Bayesians have identical objective or loss functions, nearby prior beliefs and observe
the same sequence of outcomes, without additional restrictions, it would not be pathological for them to each choose
actions that the other evaluated as markedly inferior to their own choice. The bandit example, developed in Sections
4 and 5 of this paper, illustrates this phenomenon in a context where beliefs do, in fact, affect outcomes.

For stochastic processes that might be generated by more complex feedback between beliefs and actions, such
as rational expectations models with learning, no formal demonstration is provided of generic nonconvergence of
beliefs and outcomes. Nevertheless, it would be surprising (at least to this author) if the additional complexity of
such models somehow restored the asymptotic regularity attained in analyses conducted from the perspective of the
Bayesian learner.

Economic theorists who wish to model learning by Bayesian economic agents are confronted with three options.
One alternative is to impose no restrictions upon agent beliefs and analyze the resulting stochastic process from the
perspective of a classical statistician who knows the true parameter. With sufficiently specific assumptions on other
parameters of the model (such as preferences and technology) it may be possible to show that agents with consistent
priors will eventually be financially dominant,^3 and presumably then the limiting price process would be
indistinguishable from the price process emanating from a model in which all agents had consistent priors. Even so,
there would remain the question as to whether any such asymptotic properties would be robust under arbitrarily small
perturbations of parameter values and initial beliefs.

A second alternative would be to follow the currently predominant practice of adopting the probabilistic vantage
point of the Bayesian agent(s), imposing no restrictions on agent beliefs. A drawback with this approach is that it
provides no basis for drawing distributional inferences for any particular parameter value of agent prior probability
zero. The resulting limit theorems can be interpreted as predictions by the economic theorist only if the theorist's
beliefs are absolutely continuous with respect to the agent's beliefs. So unless the reader also has beliefs absolutely
continuous with respect to the agent's, there are no grounds for accepting the agent's asymptotic predictions.^4

The final option, one that I endorse, is to narrow the set of candidate prior beliefs, a strategy that is adopted by
Bikhchandani and Sharma (1990). To motivate this strategy, consider the bandit problem studied in Sections 4 and 5,
and for simplicity suppose that $Y_1$ and $Y_2$ are each countable. If I were the decision-maker, I might find it difficult to
exactly specify my prior, but I would reject any prior belief for which there was a residual $\Theta$-set $A$, such that for all
$\theta \in A$, with $P_\theta$ probability one my beliefs would not converge. In particular, I would require that for any arm played
infinitely often, the Prohorov (or bounded Lipschitz) distance between my posterior beliefs and the sample
distribution converge to zero. And while acknowledging that introspective reasoning has its limitations, I believe
that few individuals would behave as predicted by Propositions 5.6 and 5.7. More generally, in environments where
consistent estimators are available, the modeller should assume that priors are chosen from the family of probability
measures that are consistent for all $\theta \in \Theta$.

This is philosophically similar to the "what if" method advocated by Diaconis and Freedman (1986) for
Bayesian statisticians. Diaconis and Freedman suggest that "... after specifying a prior distribution, generate
imaginary data sequences, compute the posterior, and consider whether the posterior would be an adequate
representation of the updated prior." Adapting this recommendation to the context of economic modelling, it would
be natural to require that agents have prior beliefs with full support and that the sequence of agent posterior beliefs
converges almost surely with respect to the measure $P_\theta$ for all $\theta \in \Theta$. The feasibility of such a strategy requires the
existence of consistent priors. Unfortunately, endogenous learning will typically imply a non-stationary stochastic
process. And there are few results currently available on the $P_\theta$ consistency (or convergence properties) of Bayes
estimates for such processes.^5 Further research on sufficient conditions on priors for $P_\theta$ consistency is needed.
Appendix

For simplicity we explicitly prove Proposition 3.5 for $n = 1$. The argument for the $n$-fold product space is
essentially identical and is briefly sketched.

We start by defining, for $A \in \mathcal{B}(Z)$, the function $f_A: \Lambda \to R$ by $f_A(\lambda) = \int_A h(\lambda, z)\, \nu(dz)$.

LEMMA A.1. $f_A$ is a bounded Lipschitz function, and $\|f_A\|_{BL} \le 3$.

Proof. The Lipschitz norm is defined by $\|f_A\|_L = \sup_{\lambda \ne \lambda'} \{|f_A(\lambda) - f_A(\lambda')| / d_1(\lambda, \lambda')\}$. Since

\[ |f_A(\lambda) - f_A(\lambda')| \le \int_A |h(\lambda, z) - h(\lambda', z)|\, \nu(dz) \le 2 d_1(\lambda, \lambda'), \]

the Lipschitz norm exists and $\|f_A\|_L \le 2$. Since $\|f_A\|_\infty = \sup_\lambda |f_A(\lambda)| \le 1$, the bounded Lipschitz norm
$\|f_A\|_{BL} = \|f_A\|_L + \|f_A\|_\infty$ exists and $\|f_A\|_{BL} \le 3$. ■


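LEMMA A.2. For $\mu_i, \mu \in \mathcal{P}(\Lambda)$, if $\mu_i \Rightarrow \mu$ then $\sup\{|\int_\Lambda f_A\, d(\mu_i - \mu)|: A \in \mathcal{B}(Z)\} \to 0$.

Proof. Recall that the Dudley metric $\beta$ is defined by $\beta(\mu, \gamma) = \sup\{|\int f\, d(\mu - \gamma)|: \|f\|_{BL} \le 1\}$ and that $\mu_i \Rightarrow \mu$ iff
$\beta(\mu_i, \mu) \to 0$. By Lemma A.1, $\|f_A / 3\|_{BL} \le 1$ for every $A \in \mathcal{B}(Z)$, so $|\int_\Lambda f_A\, d(\mu_i - \mu)| \le 3 \beta(\mu_i, \mu)$ uniformly in $A$.
Hence $\mu_i \Rightarrow \mu$ implies $\sup\{|\int_\Lambda f_A\, d(\mu_i - \mu)|: A \in \mathcal{B}(Z)\} \to 0$. ■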

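For $\mu \in \mathcal{P}(\Lambda)$ define the probability $P_\mu$ on $(Z \times \Lambda, \mathcal{B}(Z) \times \mathcal{B}(\Lambda))$ by $P_\mu(A \times B) = \int_B f_A(\lambda)\, \mu(d\lambda)$. Also define the
marginal probability $Q_\mu$ on $(Z, \mathcal{B}(Z))$ by $Q_\mu(A) = P_\mu(A \times \Lambda)$. Now fix a sequence $\mu_i \Rightarrow \mu \in \mathcal{P}(\Lambda)$ and define
$L_i(z) = \int_\Lambda h(\lambda, z)\, \mu_i(d\lambda)$ and $L(z) = \int_\Lambda h(\lambda, z)\, \mu(d\lambda)$. $L_i(z)$, $L(z)$ are the Radon-Nikodym derivatives of $Q_{\mu_i}$ and $Q_\mu$ with
respect to $\nu$.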

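LEMMA A.3. $L_i(\cdot) \to L(\cdot)$ in $\nu$-measure.

Proof. For each $A \in \mathcal{B}(Z)$,

\[
\int_A L_i(z)\, \nu(dz) = \int_A \int_\Lambda h(\lambda, z)\, \mu_i(d\lambda)\, \nu(dz) = \int_\Lambda \int_A h(\lambda, z)\, \nu(dz)\, \mu_i(d\lambda)
= \int_\Lambda f_A(\lambda)\, \mu_i(d\lambda) \to \int_\Lambda f_A(\lambda)\, \mu(d\lambda) = \int_A L(z)\, \nu(dz).
\]

Furthermore, by Lemma A.2, the convergence is uniform in $A$. The conclusion follows from Remark 3.1 of
Strasser (1985). ■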

For $B \in \mathcal{B}(\Lambda)$, define $L_i(B, z) = \int_B h(\lambda, z)\, \mu_i(d\lambda)$ and $L(B, z) = \int_B h(\lambda, z)\, \mu(d\lambda)$.

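LEMMA A.4. If $B \in \mathcal{B}(\Lambda)$ and $\mu(\partial B) = 0$, then $L_i(B, \cdot) \to L(B, \cdot)$ in $\nu$-measure.

Proof. First consider the case where $\mu(B) > 0$. It suffices to verify that $|\int_A L_i(B, z)\, \nu(dz) - \int_A L(B, z)\, \nu(dz)| \to 0$
uniformly in $A$. Let $\mathcal{B}(B)$ denote the Borel sets restricted to $B$. Define $a_i = \mu_i(B)$ and define $\gamma_i$ on $(B, \mathcal{B}(B))$ by
$\gamma_i(C) = a_i^{-1} \mu_i(C)$, for $C \in \mathcal{B}(B)$. Similarly, define $a = \mu(B)$ and define $\gamma$ on $(B, \mathcal{B}(B))$ by $\gamma(C) = a^{-1} \mu(C)$. Observe that
$\int_A L_i(B, z)\, \nu(dz) = \int_B f_A(\lambda)\, \mu_i(d\lambda)$ and that $\int_A L(B, z)\, \nu(dz) = \int_B f_A(\lambda)\, \mu(d\lambda)$. So

\[
\begin{aligned}
\Big| \int_A L_i(B, z)\, \nu(dz) - \int_A L(B, z)\, \nu(dz) \Big|
&= \Big| \int_B f_A(\lambda)\, a_i\, \gamma_i(d\lambda) - \int_B f_A(\lambda)\, a\, \gamma(d\lambda) \Big| \\
&\le \Big| \int_B f_A(\lambda)\, a_i\, \gamma_i(d\lambda) - \int_B f_A(\lambda)\, a\, \gamma_i(d\lambda) \Big|
 + \Big| \int_B f_A(\lambda)\, a\, \gamma_i(d\lambda) - \int_B f_A(\lambda)\, a\, \gamma(d\lambda) \Big|.
\end{aligned}
\]

Since $|f_A|$ is bounded and $a_i \to a$ (because $B$ is a $\mu$-continuity set), the first term converges to 0 independently of $A$.
Substituting $\gamma_i$ and $\gamma$ for $\mu_i$ and $\mu$ in Lemma A.2, $\sup_{A \in \mathcal{B}(Z)} \{|\int_B f_A(\lambda)\, a\, \gamma_i(d\lambda) - \int_B f_A(\lambda)\, a\, \gamma(d\lambda)|\} \to 0$.

If $\mu(B) = 0$ then observe that $L_i(z) = L_i(B, z) + L_i({\sim}B, z)$ and $L(z) = L(B, z) + L({\sim}B, z)$. Since $B$ is a
$\mu$-continuity set, ${\sim}B$ is also a $\mu$-continuity set, and $\mu({\sim}B) = 1 > 0$. So by application of the above to ${\sim}B$ and
Lemma A.3, $L_i(B, \cdot) \to L(B, \cdot)$ in $\nu$-measure. ■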

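LEMMA A.5. Suppose $\mu \in \mathcal{P}(\Lambda)$ and pick $\epsilon > 0$. Let $\mathcal{A} = \{A_1, A_2, \ldots\}$ be a disjoint cover of $\Lambda$, with diameter of
$A_t < \delta = \epsilon/5$. Define $B_T = \bigcup_{t=1}^{T} A_t$, and choose $T$ so that $\mu(B_T) > 1 - \delta$. If $\gamma \in \mathcal{P}(\Lambda)$ and
$\max_{1 \le t \le T} \{|\mu(A_t) - \gamma(A_t)|\} < \delta T^{-1}$, then $\beta(\mu, \gamma) \le \epsilon$.

Proof. Let $g \in BL(\Lambda, d_1)$ with $\|g\|_{BL} \le 1$. It suffices to prove that $|\int_\Lambda g\, d\mu - \int_\Lambda g\, d\gamma| \le \epsilon$. Define
$a_t = \inf\{g(\lambda): \lambda \in A_t\}$, define $b_t = \sup\{g(\lambda): \lambda \in A_t\}$, and note that $b_t - a_t \le \delta$, since $g$ has Lipschitz constant at most
one and $A_t$ has diameter less than $\delta$. Since $\int_{A_t} g\, d\gamma \ge \gamma(A_t) a_t$, $\int_{A_t} g\, d\mu \le \mu(A_t) b_t$, $|a_t| \le 1$ and
$|\mu(A_t) - \gamma(A_t)| < \delta/T$, it follows that

\[ \int_{A_t} g\, d\mu - \int_{A_t} g\, d\gamma \le \delta \mu(A_t) + \frac{\delta}{T}. \]

Similarly, $\int_{A_t} g\, d\gamma - \int_{A_t} g\, d\mu \le \delta \mu(A_t) + \frac{\delta}{T}$. So $|\int_{A_t} g\, d\gamma - \int_{A_t} g\, d\mu| \le \delta \mu(A_t) + \frac{\delta}{T}$ for each $t \le T$, and summing over
$t = 1, \ldots, T$,

\[ \Big| \int_{B_T} g\, d\gamma - \int_{B_T} g\, d\mu \Big| \le \delta \mu(B_T) + \delta \le 2\delta. \]

It remains to bound the difference of the integrals over ${\sim}B_T$. Since $0 \le \gamma({\sim}B_T) \le 2\delta$, $\mu({\sim}B_T) < \delta$ and $|g| \le 1$,
$|\int_{{\sim}B_T} g\, d\mu - \int_{{\sim}B_T} g\, d\gamma| \le 2\delta + \delta$. So $|\int_\Lambda g\, d\mu - \int_\Lambda g\, d\gamma| \le 5\delta = \epsilon$. ■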


PROPOSITION A.6. Suppose $\mu, \mu_i \in \mathcal{P}_+(\Lambda)$ and $\mu_i \Rightarrow \mu$. Then $\Gamma(\mu_i, z), \Gamma(\mu, z) \in \mathcal{P}_+(\Lambda)$ $\nu$ a.s., and
$\Gamma(\mu_i, \cdot) \to \Gamma(\mu, \cdot)$ in $\nu$-measure.

Proof. Verification that $\Gamma(\mu_i, z), \Gamma(\mu, z) \in \mathcal{P}_+(\Lambda)$ is routine and so omitted. Pick $\epsilon > 0$, and as in Lemma A.5, let
$\mathcal{A} = \{A_1, A_2, \ldots\}$ be a disjoint cover of $\Lambda$ with diameter of $A_t < \delta = \epsilon/5$ and $\mu(\partial A_t) = 0$. (The existence of such a
collection follows from Theorem 11.7.3 of Dudley (1989).) Define $B_T$ as in Lemma A.5, and choose $T$ s.t. $\mu(B_T) > 1 - \delta$.
Observe that $\nu$ a.s., $\Gamma(\mu_i, z)(A_t) = \frac{L_i(A_t, z)}{L_i(z)}$, which is well-defined since $\mu_i \in \mathcal{P}_+(\Lambda)$ and so $L_i(z) > 0$.
Applying Lemmas A.3 and A.4, we have $\frac{L_i(A_t, z)}{L_i(z)} \to \frac{L(A_t, z)}{L(z)}$ in $\nu$-measure. Since $B_T$ is the union of a finite number
of $A_t$ sets, by Lemma A.5, we have $\limsup_i \nu(\{z: \beta(\Gamma(\mu_i, z), \Gamma(\mu, z)) > \epsilon\}) = 0$. Since $\epsilon$ is arbitrary,
$\int_Z \beta(\Gamma(\mu_i, z), \Gamma(\mu, z))\, \nu(dz) \to 0$. ■

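PROPOSITION A.7. Suppose $\mu, \mu_i \in \mathcal{P}_+(\Lambda)$, $\mu_i \Rightarrow \mu$, $f: \Lambda \to R$ is bounded and continuous, and
$\lambda_i \to \lambda \in (\Lambda, d_1)$. Then

\[
\int_Z \Big[ \int_\Lambda f(\lambda')\, \Gamma(\mu_i, z)(d\lambda') \Big] h(\lambda_i, z)\, \nu(dz) \to \int_Z \Big[ \int_\Lambda f(\lambda')\, \Gamma(\mu, z)(d\lambda') \Big] h(\lambda, z)\, \nu(dz).
\]

Proof. Define $\varphi_i, \varphi: Z \to R$ by $\varphi_i(z) = \int_\Lambda f(\lambda')\, \Gamma(\mu_i, z)(d\lambda')$ and $\varphi(z) = \int_\Lambda f(\lambda')\, \Gamma(\mu, z)(d\lambda')$. By Proposition A.6,
$\varphi_i \to \varphi$ in $\nu$-measure. Since $|\varphi_i|$ and $|\varphi|$ are uniformly bounded by $\|f\|$, and $h(\lambda_i, \cdot) \to h(\lambda, \cdot)$ in $L^1(Z, \mathcal{B}(Z), \nu)$,
$\varphi_i h(\lambda_i, \cdot) \to \varphi h(\lambda, \cdot)$ in $L^1(Z, \mathcal{B}(Z), \nu)$. So $\int_Z \varphi_i(z) h(\lambda_i, z)\, \nu(dz) \to \int_Z \varphi(z) h(\lambda, z)\, \nu(dz)$, or equivalently,

\[
\int_Z \Big[ \int_\Lambda f(\lambda')\, \Gamma(\mu_i, z)(d\lambda') \Big] h(\lambda_i, z)\, \nu(dz) \to \int_Z \Big[ \int_\Lambda f(\lambda')\, \Gamma(\mu, z)(d\lambda') \Big] h(\lambda, z)\, \nu(dz). \qquad ■
\]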


Proof of PROPOSITION 3.5. Fix $k$ and $m$. For $n = 1$, setting $f = g_{kmn}$, the result follows directly from Proposition
A.7. For $n > 1$, repeat the above arguments replacing $z$ with the $n$-tuple $(z_1, \ldots, z_n)$ and $\nu$ with the product measure $\nu^n$. ■


FOOTNOTES 

^1 Some authors allow for heterogeneous beliefs across agents.

^2 The definition of an optimal plan when $\alpha = 0$ will be provided in the next subsection.

^3 I am indebted to Neil Wallace for this observation.

^4 The above remarks are also applicable to nonparametric Bayesian econometrics. If the reader of a Bayesian statistical
analysis has a prior not absolutely continuous with respect to the author's prior (and the prior is not consistent for
all parameter values), even with large samples there may be no merging of posterior beliefs. A more complete
discussion of these issues is provided by Diaconis and Freedman (1986).

^5 Barron (1988) has recently proven results for stationary stochastic processes. Barron's techniques are potentially
extendable to processes where endogenous learning generates the non-stationarity.